Stuck on "thinking", then crash

Oh, wow. Thanks for the rundown on the options, it’s pretty much what I’m doing.
I’m importing in the background, year by year. I’m mostly at home or at my desk at the office these days, so I can leave my phone charging and hopefully it won’t fuck up my battery.
I’ll check on another phone I have to see how it looks after importing without the timeline items. If it’s good, I might just do that, but I’d be losing a significant amount of data that way, did I understand correctly?

But unfortunately, this whole thing probably means I’ll have to somehow transition to another system/app at least for my historic data, because the situation is just beyond recovery, if I understood your post. Either I lose a lot of data, or I have a ticking bomb ready to explode constantly. Hopefully I’ll find something.

Yep. The TimelineItem files contain things like the Place assignment, so you’d lose all your manually confirmed/correct place assignments. Which for people with bad memories like me is a pretty big deal. And given years of data, it’s probably a pretty big deal for almost everyone.

Not sure I understand what you’re saying there. You can definitely get all your data back in, and eventually cleaned up. There’s nothing stopping that.

By ticking bomb do you mean the risk of database corruption again? Database corruption is extremely rare, and to be hit by it presumably twice is extremely bad luck. I wouldn’t expect a third time unless there’s something physically wrong with the phone.

I would estimate that database corruption is perhaps a 1 in 10,000 event, happening to 0.01% of users.

The problems are on many layers, Matt.
The first one is that my historical data was clearly not in the state I left it in. I had lots of unconfirmed items and clearly mistagged activities in years where I was confirming stuff day by day. So somehow, your approach to categorization went back and damaged my historic data.
So even if I get my data in, it is still not as clean as I left it. I need a more stable system that will not touch past data unless specifically prompted to do so.

The other thing is that you said it yourself: I have 180k elements where I should have some 25k, so I have a monster of a situation that is very hard to manage with iCloud, with the device, with the backups. Anything happens and I’m back to doing this. I’ve done it twice and I’m telling you, I will not do it another time. Honestly I’m not even sure why I’m doing it again now.

And on the database corruption, I’ll point you at “The Black Swan” by Nassim Taleb. Reading that sentence about the chances is not a very good sign, honestly.

What you will be seeing with reimported data will be the classifier models not yet rebuilt. The models aren’t in the backups, so after the data is reimported the model updates need to run again to see the newly imported data and update to recognise it. Then when you view that imported data again, it will look cleaner and similar to or the same as it was previously.

Anything that was previously confirmed stays confirmed forever, both in the database and in the backups. Certain “manual” properties are set on those samples and items, and nothing can then remove those values (other than user action to change them in Edit views). So anything previously confirmed and then backed up will come back in that same state when restored or reimported. It’s only the data that wasn’t previously confirmed that the classifiers are able to reevaluate.

Man, I don’t know how to tell you this
I clearly saw data from a period of time when I confirmed everything not being confirmed anymore. And movement types being clearly wrong. This was BEFORE I nuked everything, so God knows what I will find now.

And by the way, you can see this for yourself by going into the past: you will find unconfirmed items, confirm them, and then there will be items to confirm again. This means stuff isn’t getting saved or some shenanigans are happening.

I don’t want this to become combative, but I can tell you with certainty that either that is not happening and you are mistaken, or your database was corrupted to the point where data was unable to correctly save.

This app has been my entire life for the past 8-9 years. I live and breathe it. I dream it. I barely spend a single minute of my life where I’m not thinking about it. If it were possible for manual edits to be reverted by the app, I would have either seen it myself or seen incontrovertible evidence of it from someone else. Neither of those has ever happened, and nowhere in the code is anything that is capable of doing that.

What does however happen all the time is people mistakenly believing they are seeing unconfirmed items that they previously confirmed. But those items were not previously confirmed. They were items that the classifiers at the time concluded didn’t need confirmation.

It is an illusion created by the classifiers having high confidence at one point in time then low confidence at a later point in time. It creates the experience of “I cleaned all of this up at the time!” when in reality the classifier just didn’t ask you to do any cleanup on those items at the time.

Yeah, that makes perfect sense. If the UI tells me a day is all confirmed, and then 4 years later the model changes and past data is reevaluated as a result, suddenly stuff that I thought was done is not done anymore.
If there is no UI to tell me if some of the data is automatically classified or manually confirmed, I assume everything is confirmed.
This is not the case, and more importantly, there is no good way to mark a day as “correct” so that all data is guaranteed to be read only forever.

While you breathe the app, your users don’t.
I don’t want “a long term project” to recover backups if stuff goes wrong.
I don’t want to have to constantly check back on 8 year old data to see if the classifier changed its mind about the type of activities I did.
I’d like this app to require somewhat less babysitting, if that was possible!

There is a feature request for this, though the idea isn’t well fleshed out yet, and not a lot of demand for it yet. I think it’s just been some scattered thoughts here on the forum and amongst the beta testers team. It’d need more interest to get it over the line.

I recommend trying some of the other similar apps out there. You won’t get the level of detail or accuracy that Arc provides, but you might be happier with something simpler. In a sense, recording less data and less detail could be considered a feature, in that the more data there is the larger the task to manage and maintain it.

Arc’s goal is to push the hardware and current technologies as far as they can go, to aim for the greatest detail and greatest accuracy. That does come with some costs, and occasionally some risks. I work as hard as possible to mitigate those risks, and reduce those costs, but Arc’s goal isn’t to be the simplest or easiest or least time demanding. Those are goals for other apps.

The situation is that I got up to 2021 (I started the process from 2013 going forwards before you told me to go the other way, unfortunately), and I’m guessing that’s where the actual 180k timeline items are, as the import has slowed to a complete crawl. Every single item takes about two to three days to import, using MASSIVE amounts of energy (battery drops like hell), so I had to stop doing it.
I wanted to finish the import using the technique you wrote:

But it just errors and does not import anything. How do I do this?

Is there any way you could make a Python script or something like that to actually fix this backup on my computer? Something that goes through the timelineitems and checks if they can be merged or something? I literally can’t import 3 years of data at this point. Gone.

By the way, I tried doing the GPX import. Unfortunately you didn’t put in any batch option, so since I don’t have the time to import thousands of days by hand, I tried using the monthly GPX files Arc generated, and… nope

  • You put the import button at the end of the list, so I have to scroll past thousands of elements. Not fun, and often crashes.

  • If I press the select all button and then start scrolling to get to the import button, the app crashes 100% of the time

  • If I first scroll to the import all button and the app doesn’t crash, then select all, then start the import… well, the import starts but it crashes halfway through or less.

I had high hopes of rebooting everything with the GPX data but nope.

When you try to import a samples week file it’ll at first error due to the missing dependent files. But then if you tap on the error exclamation mark it’ll pop up a view that offers a button to retry the import ignoring missing dependents.

I think what I would try in that case, if Python scripting is an option, would be to read the sample files and collate a list of timelineItemIds that are referenced. That would then give a list of which TimelineItem files need to be kept, while the rest could be discarded.

Although that’s assuming that the massive number of extra TimelineItem files are items that are no longer referenced. If instead the db had broken in such a way that for every new sample created a new timeline item was being created… that wouldn’t work.

Unfortunately I don’t have a good guess either way in that case. Typically if there’s excess TimelineItem files they’re files for items that have since been deleted from the database (or no longer referenced) but for whatever reason weren’t also deleted from the backups. But when there’s db corruption all bets are off.
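For what it’s worth, the “collate the referenced timelineItemIds” idea could be sketched in Python roughly like this. It is only a sketch under assumptions, not a tested recovery tool: it assumes each samples file is a JSON array (optionally gzipped) of objects carrying a `timelineItemId` key, and that each TimelineItem file is a single JSON object with an `itemId` key. The function names are made up for illustration. It only reports which files are unreferenced and deletes nothing.

```python
import gzip
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file, transparently handling gzip (.gz) compression."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def referenced_item_ids(samples_dir):
    """Collect every timelineItemId mentioned by any samples file."""
    ids = set()
    # Matches both "name.json" and "name.json.gz" anywhere under samples_dir.
    for path in sorted(Path(samples_dir).rglob("*.json*")):
        # Assumption: each samples file is a JSON array of sample objects.
        for sample in load_json(path):
            item_id = sample.get("timelineItemId")
            if item_id:
                ids.add(item_id)
    return ids


def split_item_files(items_dir, referenced):
    """Return (keep, unreferenced) lists of TimelineItem file paths."""
    keep, unreferenced = [], []
    for path in sorted(Path(items_dir).rglob("*.json")):
        # Assumption: each item file holds one object with an "itemId" key.
        item_id = load_json(path).get("itemId")
        (keep if item_id in referenced else unreferenced).append(path)
    return keep, unreferenced
```

If the unreferenced list comes back nearly empty, that would point to the “one new item per sample” failure mode instead. Either way it would be safest to run this against a copy of the backup folder, and to move unreferenced files aside rather than delete them.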

Hm. Well that shouldn’t happen! I tested the importer with some massive GPX files from the beta team, and optimised it for the largest of them. But I guess a monthly export from Arc is even larger. I’ll do some testing on one of those too.

Unfortunately an import from GPX will be fairly poor anyway, due to the GPX format not supporting a lot of the extra information that Arc records/uses. Accelerometer data isn’t present, for example, so the classifiers can’t do their job with the imported data. And Place assignments are simplified down to simply the text of the place’s name, used as a custom title. So 10 visits to the same Starbucks would each be imported as a visit with a custom title of “Starbucks” instead of pointing to the single correct store.

The ability to import from ArcJSON would solve those problems. ArcJSON is the JSON format that Arc’s manual JSON exports use, which differs somewhat from the JSON used in the backups. It’s not currently supported for importing, because there’s been very little interest in it.

I think given the current situation you’re experiencing, I’d go with trying to do the “import ignoring missing dependents” approach, and see how that goes for a test samples week file. Then reassess from there.

Unfortunately, I’ve seen that reimport ignoring the missing dependents only once. All other samples do not show that option. Also this means I’m in for hundreds of manual taps, the fun just keeps on coming.

GPX has support for extensions on every single XML field by the way, so you could put whatever you want in there.

As for the Python script, I do not think the items are unreferenced, from what I can see in the JSON files. It’s probably, as you said, one item per data sample. What I mean is, do you think you can find a way to aggregate them using the same logic LocoKit uses to decide when to start a new TimelineItem?

It will show the button as long as at least one sample in the file was skipped due to its referenced TimelineItem file being missing.

True, but Arc’s GPX exports were never intended to be used for reimporting back into Arc. So the only point in adding extensions for things like accelerometer data would be if some other GPX consuming app were going to make use of them. None of the existing GPX extensions support accelerometer data fields, so there’s little chance of there being any GPX consumers interested in those extras.

ArcJSON is the export format intended for encapsulating all the information that Arc records and stores.

Hmm. That sounds like a lot of work. Or… perhaps not.

When LocoKit records the initial samples it does real time moving/stationary state detection, which is used to form the initial Visits / Paths (the TimelineItem subclasses), simply based on whether moving or stationary. That moving/stationary state detection is only dependent on location data, nothing else, so it shouldn’t have been impacted by any database corruption, and should still be usable for forming basic Visit / Path distinctions.

You can see that logic here in the TimelineRecorder code. You wouldn’t be able to do the activity type classification that it uses in the moving->moving case below the mode-change-allowed threshold. But the basics of “put contiguous moving samples together, and contiguous stationary samples together” wouldn’t be too tricky.

I imagine the most fiddly part would be in reusing existing TimelineItem files. Though I guess if there’s a moving/stationary change boundary and thus need for a new TimelineItem you could just take the existing TimelineItem referenced by the boundary sample and edit that item file as appropriate (set its isVisit bool as appropriate).

The startDate and endDate fields in TimelineItem files are optional, and not actually used for the import process, so it’s fine to leave those either wrong or missing. Once imported, an item will update its date range based on the samples assigned to it. So those values in the JSON files are just there as convenience for other JSON consumers.
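A rough Python sketch of that “contiguous moving together, contiguous stationary together” regrouping might look like the following. It is not LocoKit’s actual logic, just the basic idea under assumptions: each sample is assumed to have a `date` string that sorts chronologically (for example ISO 8601 in a single timezone), a `movingState` of `"moving"`, `"stationary"`, or anything else treated as uncertain, and a `timelineItemId`. The function names are hypothetical.

```python
def group_samples(samples):
    """Split samples into runs of contiguous stationary or moving state."""
    segments = []
    # Assumption: "date" strings sort chronologically as plain text.
    for sample in sorted(samples, key=lambda s: s["date"]):
        # True = Visit, False = Path, None = uncertain state.
        kind = {"stationary": True, "moving": False}.get(sample.get("movingState"))
        current = segments[-1] if segments else None
        changed = (
            current is not None
            and kind is not None
            and current["isVisit"] is not None
            and kind != current["isVisit"]
        )
        if current is None or changed:
            current = {"isVisit": kind, "samples": []}
            segments.append(current)
        elif current["isVisit"] is None:
            # A run that began with uncertain samples takes the first known state.
            current["isVisit"] = kind
        current["samples"].append(sample)
    return segments


def assign_item_ids(segments):
    """Reuse one existing timelineItemId per segment, never the same id twice."""
    used = set()
    for seg in segments:
        candidates = (s.get("timelineItemId") for s in seg["samples"])
        seg["itemId"] = next((c for c in candidates if c and c not in used), None)
        if seg["itemId"] is None:
            continue  # nothing reusable; leave these samples untouched
        used.add(seg["itemId"])
        for s in seg["samples"]:
            s["timelineItemId"] = seg["itemId"]
    return segments
```

Each segment’s `isVisit` value is what would then be written into the reused TimelineItem file, and the samples would be written back out with their updated `timelineItemId`. Uncertain samples simply stay with whatever segment they fall inside, which is a simplification.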

It might be worth having a look at these two projects:


This is what I get. I’m at my wits’ end. (All of them show the exclamation mark, only some of them actually show the missing items, but all of them have the errored samples. Is this good? Is this bad?)

Man, what I’m trying to say is… your app made the whoopsie, I paid hundreds of euros for it over the years, I’m really not into fixing this myself. I hoped you would actually try to fix this gigantic mess. I’ve seen you blame Apple for this and that and for almost every problem. Now this is clearly on you, and basically what you told me is to either lose data (and not even that works, turns out) or spend an inordinate amount of time and energy and probably battery life to work around bug after bug I encounter. And not even that guarantees full recovery, as I clearly have an incorrect situation and you offered exactly ZERO to actually solve the problem.
I’ve offered you to contact me to go through the db and find the culprit, to run some scripts on my computer to try to fix the data, to try to see what’s going on. You show no interest.

Why do you piss on your customers like this? This is 10 years of data we are talking about.

Hmm. Must be some other error happening then. I’ll dig into the code in a sec and see what the possibilities are.

What do you suggest I do? If there were an easy (or even difficult) way to fix this kind of problem, I would be doing it.

Which I do when that is the case.

I have two other apps currently, each of which I built in less than a month, that require effectively no support or maintenance, and run on both iPhone and iPad. Most apps are incredibly simple and easy to build and maintain. Arc is not. It is not a normal app.

Arc is the absolute worst case end of the spectrum. Arc is effectively constantly at battle with iOS and what Apple want apps to do on iOS. It’s also at battle with the limits of current hardware and technology, in that it records more data than any of Apple’s own apps (with the exception of Photos app), and also needs to process that mountain of data, back it up, export it, import it, analyse it, collate it, and more, all while staying within iOS’s energy use and CPU use limits each day.

Some of Arc’s problems are indeed on me. I don’t have enough time or energy to stay on top of every little (or even big) thing. But many, if not most of the problems are due to restrictions put in place by Apple, either documented or not. When I blame Apple for something, it’s because it’s their doing.

I am one person. There is no support team, it’s just me. In order to keep the business (and myself) alive I have to not just support existing users but also build new features (subscriptions drop off if there isn’t a major update every month). I also need to constantly maintain the app to keep it working properly after each new iOS update, for new (and old) phone models.

I simply don’t have time to do everything that needs to be done. And hand editing the databases of individual users is not something I can add to the already too long list.

I put in about an hour every day on the support forum, giving as detailed and thoughtful assistance as I can, including on weekends. I work seven days. I’m certainly not pissing on anyone. I’m doing the best I can.

Look, if you think it’ll help, upload your SQLite file somewhere for me to download, and I’ll see if I can spot any clues in it. I’m not confident I will be able to see any, but if that’s what you want me to do, I’ll give it a try.