JSON Import Format

Is the ARC JSON format documented anywhere?

I’m wondering if it’ll be possible to build external transformers that can take other data formats and convert them into the ARC format. Then it would be possible for the community to build its own transformers for data sources like Google Timeline or GPX files.


That would be great!

The format isn’t formally documented anywhere, but the code behind it is available on Github in the LocoKit repo, in the various TimelineObject classes (samples and timeline items). I can also answer any questions anyone has here!

Reading through the code, am I right in saying you would need to generate locomotion samples to create a travel path?

The timeline objects seem simple enough, but I’m not sure of the best way to create locomotion samples, as most data won’t include 90% of the info normally in a locomotion sample. Additionally, would creating sparse locomotion samples throw off the ML training? I suspect data taken from 3rd party sources wouldn’t be good enough for Arc’s ML training process.

[EDIT] looking through ArcMini it seems that only samples and timeline entries with the source LocoKit are used to update models, everything else is ignored. So I assume this means I can create sparse samples with a made up source, and they won’t end up polluting Arc’s training process?

Yep. LocomotionSamples contain the lat/long coordinates of samples recorded every 2 to 60 seconds.

Yeah, the ML will largely skip over those samples, judging them not suitable for learning from. It might be able to classify them though, purely based on coordinates, but the results will be less than ideal. If the original data has known activity types, then you’d want to put that type in the confirmedType field for each sample, so that the classifier doesn’t bother trying to make sense of them.
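To make the confirmedType suggestion concrete, here is a minimal sketch of tagging converted samples with a known activity type before writing them out. Only the confirmedType field name comes from the thread; the sample layout (date, location) and the "walking" value are assumptions for illustration, so check them against a real export.

```python
def mark_confirmed(samples, activity_type):
    """Return copies of the samples with a known activity type attached."""
    tagged = []
    for sample in samples:
        tagged_sample = dict(sample)  # shallow copy, the input list is left untouched
        # confirmedType tells the classifier the type is already known.
        tagged_sample["confirmedType"] = activity_type
        tagged.append(tagged_sample)
    return tagged


# Hypothetical sparse samples, e.g. converted from a GPX track.
# The key names below are assumptions, not a documented schema.
samples = [
    {"date": "2019-05-01T08:00:00Z", "location": {"latitude": 51.5, "longitude": -0.12}},
    {"date": "2019-05-01T08:00:30Z", "location": {"latitude": 51.501, "longitude": -0.121}},
]
tagged = mark_confirmed(samples, "walking")  # "walking" is an assumed type name
```

This only makes sense when the original data really does record the activity type; guessing a type would defeat the purpose.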


I’ve managed to successfully massage a single GPX file into ArcJSON and import it. Off the back of this I’ve got a couple of questions:

  1. Can you add the ability to set the source in imports? At the moment everything gets imported with the source LocoKit, which means that Arc will try to use the data for training the ML stuff. Ideally it wouldn’t do this, because the data quality is so low.

  2. Is there a way to import timeline items with no next or prev timeline ID and have Arc figure that out? At the moment it’s difficult to attach new timeline items to the existing timeline in Arc.

  3. Can you expose a more basic location attribute for imports that isn’t based on a CLLocation object? I don’t have enough data for course, or lat/lon accuracy. It would be nice to omit these values instead of adding filler values.


This is something that can be / probably should be done in the Arc Mini code on Github. Though I don’t think the new file import system is in the Arc Mini code yet. I’ll get on that.

Yeah, it’s actually better to not give those fields a value at all. I have a hunch the importer actually always ignores them anyway, because the database will reject incorrect / impossible linked lists. So it imports them with no next/previous links, then lazily rebuilds those links based on nearest neighbour, after import.
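A small sketch of what leaving those fields out could look like when generating timeline items. The key names previousItemId and nextItemId are an assumption (they do not appear in this thread), so compare them with a real exported TimelineItem file before relying on this.

```python
# Assumed JSON keys for the linked-list fields; verify against a real export.
LINK_KEYS = ("previousItemId", "nextItemId")


def without_links(item, link_keys=LINK_KEYS):
    """Return a copy of a timeline item dict with the link fields omitted."""
    return {key: value for key, value in item.items() if key not in link_keys}


item = {
    "itemId": "3F2504E0-4F89-11D3-9A0C-0305E82C3301",
    "previousItemId": "00000000-0000-0000-0000-000000000001",
    "nextItemId": None,
}
clean = without_links(item)
```

The fields are dropped entirely rather than set to null, matching the advice to not give them a value at all.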

Hmm. I thought it’d happily accept it as long as lat/long exists, but looks like I’ve remembered wrong. Could modify that so that, if full CodableLocation fails, it falls back to just using lat/long to initialise a basic CLLocation.

This would be amazing, would help me understand the import process and how to create valid import data.

Is there a way in the current Arc import process to bulk import timeline items? I can’t seem to import a timelineItems.json file with multiple timeline items in it, even if I use the export schema, and there’s no “Import All” button for timeline items in the UI, forcing me to tap each item manually, which gets tedious pretty fast.

I’ve been struggling with this quite a bit. The only way I’ve managed to get imported data to appear in Arc is to create a valid linked timeline series, import it, and then import an update for an existing timeline item to set its previous timeline item to the imported data. I’m not sure if Arc is relinking in the background or not, there’s no UI to show the progress, it just looks like the import failed.

It would be nice if the import UI showed the progress of the re-linking. Looking through Arc Mini, it looks like timeline items should be relinked if they’re broken when you look at the correct day. But I’ve failed to induce this behaviour. Additionally, the data I want to import is from years back, so tapping “back” over 600 times doesn’t really work for me.

That would be super nice. Would also be cool if LocomotionSamples could be marked as dirty or something, and then have those samples excluded from ML updates.

The trick is to import the LocomotionSamples first, and the importer will find the relevant TimelineItem files itself. When importing a sample, it looks at the timelineItemId, checks the db to see if it has that item, and if not, looks for a matching JSON file to import along with the sample(s).

Likewise Places - if a TimelineItem file references a placeId, it’ll check the db for existing, then check for a matching JSON file. So importing a LocomotionSamples file will also trigger the import of all referenced TimelineItem and Place files.
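Since the importer resolves TimelineItem and Place files by filename, a pre-flight check on the generated folder can catch missing files before anything goes through iCloud. The sketch below mirrors the lookup order described above, assuming the folder layout TimelineItem/<ID>.json and Place/<ID>.json used elsewhere in this thread; find_missing_files is a hypothetical helper, not part of Arc or LocoKit. It cannot see Arc’s database, so a file it reports as missing may be fine if that object already exists in the app.

```python
import json
from pathlib import Path


def find_missing_files(samples, import_root):
    """List TimelineItem/Place files that the samples reference but that are absent."""
    root = Path(import_root)
    missing = []
    # Every sample points at its parent item via timelineItemId.
    item_ids = {s["timelineItemId"] for s in samples if s.get("timelineItemId")}
    for item_id in sorted(item_ids):
        item_path = root / "TimelineItem" / f"{item_id}.json"
        if not item_path.exists():
            missing.append(item_path)
            continue
        with open(item_path) as f:
            item = json.load(f)
        # An item may in turn reference a place via placeId.
        place_id = item.get("placeId")
        if place_id:
            place_path = root / "Place" / f"{place_id}.json"
            if not place_path.exists():
                missing.append(place_path)
    return missing
```

Calling find_missing_files(samples, "Import") with the decoded samples list returns the paths that still need to be generated; an empty list means every reference resolves to a file.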

Hm, that sounds weird. I wonder if this is due to importing TimelineItem files before samples? TimelineItems with no associated samples won’t show up in the UI, due to being empty / zero duration. (They’ll also get auto deleted later, during some cleanup process, if they still have no associated samples).

Starting the import from LocomotionSamples file first might get around that weirdness.

Ok so I’ve dumped into Arc Mini all the importer and related views from Arc App.

The code is literally just dumped in without edit at this stage, so it won’t compile, because it references a bunch of Arc App specific little helpers and whatnots. (I’ve also removed them from the project target, to avoid them breaking the build).

ArcImporter is the brains. It’s just an ugly “manager” dumping ground of all file importing functions. Not exactly elegant architecture, but it got the job done :joy:

ImportList is the manual importer view that you’ve been using.

RestoreView is the managed importer that’s available from fresh installs, and is literally just a stripped down version of the manual importer, that “pushes all the buttons” itself. So that one’s the least useful to sift through.

ImportTask is the wrapper / state machine around each file being imported, that ArcImporter uses, and ImportList uses for knowing what to show in the UI.

ErrorLogView is an unfinished/unused all-in-one view for showing all errors from a managed restore (ie RestoreView). It’s not used by the manual importer, which already lets you see errors individually per import task.

When it comes time to hack this stuff up to deal gracefully with different file formats, I think that big ugly single file ArcImporter will need to be split up.

It’s all working on the assumption it can just take a JSON file and directly decode it to the appropriate TimelineObject type (array of LocomotionSamples, a TimelineItem, Place, Note, etc). Which won’t be possible with other file formats - they’ll need an intermediary layer.

Which I do actually kinda have, in the very old Moves App data importer. But that was hacked up in a panic in a single weekend, after Facebook announced out of the blue that they were closing Moves down and my inbox filled up with 10,000 emails. It’s elegant work for a panicked weekend, but it’s not elegant enough to be reusable.

Actually I should still drop it into the Arc Mini repo. I’ll have a sift over it now, and see what it looks like…
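For a sense of what an intermediary layer for another format might involve, here is a rough sketch that reads GPX 1.1 track points into plain records, which a later step could map onto LocomotionSamples. It is not taken from the Moves importer or from Arc; read_gpx_points and the record keys are hypothetical, and it handles only trkpt elements with optional ele and time children.

```python
import xml.etree.ElementTree as ET

# Standard GPX 1.1 namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_gpx_points(gpx_text):
    """Parse GPX 1.1 text into a list of neutral point records."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", GPX_NS):
        ele_el = trkpt.find("gpx:ele", GPX_NS)
        time_el = trkpt.find("gpx:time", GPX_NS)
        points.append({
            "latitude": float(trkpt.get("lat")),
            "longitude": float(trkpt.get("lon")),
            # Optional fields stay None rather than being given filler values.
            "altitude": float(ele_el.text) if ele_el is not None else None,
            "timestamp": time_el.text if time_el is not None else None,
        })
    return points
```

Keeping the records format-neutral means the same downstream code could serve a Google Timeline reader or any other source that yields latitude, longitude and time.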

Ahhhh! That’s what I was missing. I assume I need to name the timeline item files correctly? (Don’t answer that, I’ll dig through the import code!).

I’ll need to do a little more investigation on this. My description of the behaviour isn’t quite accurate. But without the import code it’s difficult to figure out why imports were failing. I’ll compare what I’m generating to what’s expected and see if I can figure out what’s breaking.

Oh perfect, this is gonna be so helpful!

Just as a heads up, I would love to dive into this and contribute back to the project. But it’s likely my employment contract prevents me contributing code directly :frowning:, I’ll double check, but I suspect I’m limited to giving pointers on bugs.

You’ve probably already figured it out, but yep. The importer looks for them by filename, to keep the process efficient. That’s also why there’s only one object per file, for TimelineItems and Places.

No worries. Such is life :wink:

Yeah figured that one out. It was the capitalisation that was throwing me off. I was generating UUIDs in lowercase and using that in the JSON and file names. Turns out the import code is parsing the UUID, then converting it back to string for the file lookup, but Swift uses uppercase UUID strings rather than lowercase (which Python does by default), so Arc wasn’t finding my files.

Pretty easy to fix on my part. But would be nice if the file search code could be made case insensitive.
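The fix on the Python side can be sketched as below: generate or normalise every ID through one helper so the JSON and the file names always use the uppercase form that Swift produces. swift_style_id is a hypothetical helper name.

```python
import uuid


def swift_style_id(value=None):
    """Return an uppercase UUID string; generate a new one if no value is given."""
    # Parsing first also validates IDs that came from another source.
    parsed = uuid.UUID(value) if value else uuid.uuid4()
    return str(parsed).upper()


item_id = swift_style_id("3f2504e0-4f89-11d3-9a0c-0305e82c3301")
file_name = f"TimelineItem/{item_id}.json"
```

Using the same helper for the itemId inside the JSON and for the file name keeps the two from drifting apart.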

Also one other thought: it would be nice if the import code could import from a compressed file. iCloud is really slow at syncing thousands of small files, and the whole downloading stuff on iOS is a bit dodgy. Being able to create a single compressed file with the right internal structure would make it much easier to push data over iCloud.

Am I right in saying the current import code doesn’t support the Arc JSON export format? To import Arc JSON, I would need to split the JSON into a LocomotionSample/samples.json file and a bunch of TimelineItem/<UUID>.json files?

Argh. That’d be the most annoying mystery bug to find :joy:

Agree.

Oh it doesn’t?? Wait, yes, the LocomotionSample files are gzipped. Ah, but you mean zipping up a bunch of single object files, like TimelineItem files, into a tar.gz or similar?

Yeah, that’d make sense. iCloud syncing has been one of the biggest headaches of testing and debugging the backup system.

Yep :disappointed: To my shame.

I was writing all this new backup/restore system during Covid lockdowns, and got super burnt out on it (along with everything else in this currently very dull world). It was a struggle to even get it shipped in its current state. So even though “support Arc JSON” was literally a core initial requirement, I didn’t get to it before burning out completely.

Theoretically it should be trivial to make it happen. I mean, Arc JSON is literally just a serialised TimelineSegment, that could be decoded straight back into a tree of TimelineObjects. But yeah, need to muster up the energy to actually get it done.

Here’s a little bit of Python that takes an Arc JSON export and converts it into the separate TimelineItem, LocomotionSample and Place files.

Anyone running this will need to create the three output folders (TimelineItem LocomotionSample Place) before they run the script on their data.

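import json

# Load the Arc JSON export (replace the placeholder with your file name).
with open('<input file>.json') as file:
    data = json.load(file)

places = {}          # placeId -> place object
timeline_items = {}  # itemId -> timeline item, with its samples removed
samples = []         # every sample from every item, in one flat list

for item in data['timelineItems']:
    # Collect each embedded place once, keyed by its ID.
    if item.get('place'):
        places[item['place']['placeId']] = item['place']

    # Move the item's samples into the shared list, since the importer
    # expects samples in their own file.
    if item.get('samples'):
        samples += item['samples']
        del item['samples']

    timeline_items[item['itemId']] = item

# All samples go into a single file.
with open('LocomotionSample/samples.json', 'w') as file:
    json.dump(samples, file, indent=2)

# One file per timeline item, named after its ID.
for item_id, item in timeline_items.items():
    with open(f'TimelineItem/{item_id}.json', 'w') as file:
        json.dump(item, file)

# One file per place, named after its ID.
for place_id, place in places.items():
    with open(f'Place/{place_id}.json', 'w') as file:
        json.dump(place, file)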

That’s brilliant! Thanks so much for this :smile:

Also goes to show how much more concise and efficient Python can be.

I’ve also got one that will take Arc exported GPX files and produce mostly ok JSON for import. I’ll throw that up here once I’ve given it a bit more of a clean.


Thanks for the script.

I once temporarily reverted to an older iOS version on another device because of some issues. As a result of that, and various other circumstances, my current Arc app data has diverged from my earlier data, which covers years. I still have that older backup and am ready to try merging the two.

Any updates or precautions? From what I can tell, I can take the older backup with necessary folders, run this script, and then place the transformed files in the Import folder?