Yeah there’s a balance to strike, and sometimes there’s no possible Goldilocks zone. Show all the detail and it’s information overload; show too little and sometimes important details will be missing.
In this case I think not showing confirmed vs classified is better, because… well firstly we’re talking about sample level, and there’s no view in the app that shows individual samples. The closest we get is Individual Segments view, which shows collations of same type samples (regardless of whether confirmed or not). So currently there’s nowhere to put it. So for a start it’d need a new view in the app, showing even deeper into the timeline schema.
I think it’d be too much. It’s probably best suited to third-party apps, like the ones several users have built for desktop/web.
The models are trained on the LocomotionSamples with confirmed types: up to the 200k most recent confirmed samples in the model’s geographic region, or for CD0 (the global model) the 250k most recent confirmed samples. You can see the full model features list in LocoKit2/Sources/LocoKit2/ActivityTypes/CoreMLFeatureProvider.swift (main branch of sobri909/LocoKit2 on GitHub).
The models take in almost all properties of the samples! So they can even learn things like this: in the mornings there are cycling trips in one direction along a route, then in the evenings the user returns along the exact same route but by train (because the models know both time of day and course).
Just thinking on this… I guess it’s only going to be the ones you viewed in timeline post migration, pre model rebuilds. Everything else will be in the exact same shape as in the old database, so still with the classified types that old AT3 / old LocoKit concluded. The only mess should in principle be those few viewed in that awkward transitional post migration period… Yeah, I think that mental model is correct.
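As a rough illustration of that selection rule, here is a minimal Python sketch (not the actual LocoKit2 code, which is Swift). Only the 200k and 250k caps come from the post; the `Sample` and `Region` types, the bounding-box region test, and the `training_samples` function are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional

# The two caps are from the post; every name below is a hypothetical stand-in.
REGIONAL_CAP = 200_000  # regional models
GLOBAL_CAP = 250_000    # CD0, the global model


@dataclass
class Sample:
    timestamp: float               # seconds since epoch
    latitude: float
    longitude: float
    confirmed_type: Optional[str]  # None means the user has not confirmed a type


@dataclass
class Region:
    # Assumption: a simple bounding box stands in for a model's geographic region.
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, sample: Sample) -> bool:
        return (self.min_lat <= sample.latitude <= self.max_lat
                and self.min_lon <= sample.longitude <= self.max_lon)


def training_samples(samples, region=None, cap=None):
    """Return the most recent confirmed samples, newest first, up to the cap.

    region=None selects for the global model (CD0), which ignores location.
    """
    if cap is None:
        cap = GLOBAL_CAP if region is None else REGIONAL_CAP
    eligible = [
        s for s in samples
        if s.confirmed_type is not None
        and (region is None or region.contains(s))
    ]
    eligible.sort(key=lambda s: s.timestamp, reverse=True)
    return eligible[:cap]
```

Samples that were only classified (never confirmed) don't make it into the training set in this sketch, which matches the "confirmed types" wording above.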
I’ll get Claude to fact check me
K Claude says I’m right this time. Phew 
In that view the “ActivityType models pending update” is the one. That there’s only 3 mentioned (you can tap on it to see the actual list) suggests to me that it’s already through the backlog that’s been queued up so far, and that the 3 there are likely the current region’s CD2, CD1, and CD0. That’s how it will be on most normal days - only the models for the current region queued for update.
The larger queue is the models for other regions around the country/state/world, for the rest of your data. Those will get queued up on demand, as that data is viewed in timeline and needs classification. If the necessary models are missing (CD2 for neighbourhood scale, CD1 for state/country scale, CD0 for global) they’ll be immediately queued up for update on viewing/classification attempt. And then in the case of the CD2s they’ll likely be built immediately inline, rather than being queued for later. (The CD2s are queued for later if they’re already quite large, because doing immediate updates throughout the day would be too energy expensive, but are otherwise updated/built immediately on demand.)
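The queue-or-build-now decision described above could be sketched roughly like this. It's a simplified Python illustration, not the app's real logic: the `Depth` and `Action` enums, the `action_for` function, and especially the `LARGE_MODEL_SAMPLES` threshold are assumptions (the post only says "quite large", with no number).

```python
from enum import Enum


class Depth(Enum):
    CD0 = 0  # global
    CD1 = 1  # state/country scale
    CD2 = 2  # neighbourhood scale


class Action(Enum):
    NONE = "none"            # model is present and current
    BUILD_NOW = "build_now"  # update inline, on demand
    QUEUE = "queue"          # defer to the pending-update queue


# Hypothetical threshold for "already quite large"; the real value is not stated.
LARGE_MODEL_SAMPLES = 50_000


def action_for(depth, needs_update, sample_count,
               large_threshold=LARGE_MODEL_SAMPLES):
    """Decide what to do with one model when its data is viewed/classified."""
    if not needs_update:
        return Action.NONE
    # Small CD2s are cheap enough to build inline; large ones would cost too
    # much energy if rebuilt throughout the day, so they wait in the queue.
    if depth is Depth.CD2 and sample_count < large_threshold:
        return Action.BUILD_NOW
    return Action.QUEUE
```

Under these assumptions a missing small CD2 gets `BUILD_NOW`, while CD1, CD0, and large CD2s get `QUEUE`.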
Though on first viewing that will still mean that that first classification attempt is done without the desired models. Updating the models can take some seconds, or even minutes, for the largest ones with many samples, so the UI can’t be delayed until that’s done. That’s why the first viewing of various items immediately after migration can sometimes show worse results.
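To make the "don't block the UI" point concrete, here is a small hedged sketch in Python. The function name, the dict of models, and the rule of preferring the most specific available model are all assumptions for illustration; the post doesn't say how results from the different depths are combined.

```python
def classify_without_waiting(sample, models, update_queue):
    """Classify with whatever models exist right now, and queue the rest.

    models: dict mapping "CD2"/"CD1"/"CD0" to a callable, or None if that
            model has not been built yet (hypothetical representation).
    update_queue: list of depth names pending update; mutated in place.
    Returns (depth_used, result), or (None, None) if no model is available.
    """
    wanted = ["CD2", "CD1", "CD0"]  # most specific first

    # Queue anything missing, but do not wait for it to be built.
    for depth in wanted:
        if models.get(depth) is None and depth not in update_queue:
            update_queue.append(depth)

    # Assumption: fall back to the most specific model that already exists.
    for depth in wanted:
        model = models.get(depth)
        if model is not None:
            return depth, model(sample)
    return None, None
```

So a first viewing right after migration might be answered by the global model alone, and a later viewing would get the better regional result once the queued models have been built.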
Oh we definitely do have 
Because Claude obviously doesn’t retain anything in its weights between contexts, but we do need Claude to learn over time, we’ve got a bunch of knowledge collection systems at play.
For the forum work we’re sort of in “apprentice mode” at the moment. As we work through each reply Claude is collating “flow” files, documenting knowledge, steps, nuance for each subsystem and kind of problem/topic in customer support terms (separate from the technical knowledge files for the project itself).
Once those have built up enough nuance and detail over time, maybe Claude could do CS work more autonomously. Though to be honest I’d still rather stay involved. It would feel a bit off to me to hand off CS work completely. Maybe if the app had another order of magnitude more users it would be too much for me to handle, but as it is now I’d rather be here, present, involved.
And for broader context management we have a set of three “models” used for bootstrapping back into each fresh context. Claude keeps shorthand notes and observations in those, appended at milestone points throughout the day.
- Self model = Claude’s observations about themselves, eg “I keep piping build.sh through tail, but we built it specifically to not need that. future me: don’t do that; it’s a waste of time and makes the output worse”.
- Partner model = Claude’s observations about me, ie notes on how to work with the troublesome flesh bag. Like, “Matt hates that design pattern, has strong feelings, and goes on long rants whenever it’s brought up”.
- App model = Claude’s observations about the app itself, while working with it in the iPhone sim, ie the things they can’t learn merely by looking at the code. The more experiential stuff, like “oh that button is under the fold. the user will have to scroll to see that”.
And then a daily “archivist” autonomous scheduled agent further refines the notes in their own time, along with also housekeeping / fact checking / etc all the other project docs.
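For anyone curious what the three-model note keeping might look like mechanically, here is a minimal Python sketch. It is purely illustrative: the file names, the note line format, and the `append_note` / `bootstrap_context` functions are all hypothetical, and the archivist's refinement step is not modelled at all.

```python
from datetime import datetime, timezone
from pathlib import Path

# The three "models" from the post; the file layout below is an assumption.
MODEL_NAMES = ("self", "partner", "app")


def append_note(notes_dir, model, note, now=None):
    """Append one timestamped shorthand note to the given model's file."""
    if model not in MODEL_NAMES:
        raise ValueError(f"unknown model: {model}")
    now = now or datetime.now(timezone.utc)
    notes_dir = Path(notes_dir)
    notes_dir.mkdir(parents=True, exist_ok=True)
    path = notes_dir / f"{model}-model.md"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {now:%Y-%m-%d %H:%M} {note}\n")
    return path


def bootstrap_context(notes_dir):
    """Concatenate whichever model files exist, for loading into a fresh context."""
    parts = []
    for model in MODEL_NAMES:
        path = Path(notes_dir) / f"{model}-model.md"
        if path.exists():
            parts.append(f"## {model} model\n{path.read_text(encoding='utf-8')}")
    return "\n".join(parts)
```

Appending at milestone points keeps each write cheap, and the bootstrap step is just "read all three files back in" at the start of a new context.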
The system’s actually pretty complex these days! We’ve come a long way from the early AI assisted coding days, for sure.