update notes

2020-05-26 10:30:30 +02:00 · 2020-05-26 10:30:30 +02:00 · 464615ab21
parent 209aa77a2e
commit 464615ab21
2 changed files with 142 additions and 1 deletions
--- a/doc/impl-thoughts/E2EE.md
+++ b/doc/impl-thoughts/E2EE.md
@@ -108,7 +108,7 @@
 
 ## SendQueue

-we'll need to pass an implementation of EventSender or something to SendQueue that does the actual requests to send a message, one implementation for non-e2ee rooms (upload attachment, send event OR redact, ...) and one for e2ee rooms that send the olm keys, etc ... encrypts the message before sending, reusing as much logic as possible. this will entail multiple sendScheduler.request slots, as we should only do one request per slot, making sure if we'd restart that steps completed in sending are stored so we don't run them again (advancing olm key, ...) or they are safe to rerun. The `E2eeEventSender` or so would then also be the thing that has a dependency on the memberlist for device tracking, which keeps the dependency tree clean (e.g. no setMembers on a class that does both e2ee and non-e2ee)
+we'll need to pass an implementation of EventSender or something to SendQueue that does the actual requests to send a message, one implementation for non-e2ee rooms (upload attachment, send event OR redact, ...) and one for e2ee rooms that send the olm keys, etc ... encrypts the message before sending, reusing as much logic as possible. this will entail multiple sendScheduler.request slots, as we should only do one request per slot, making sure if we'd restart that steps completed in sending are stored so we don't run them again (advancing olm key, ...) or they are safe to rerun. The `E2eeEventSender` or so would then also be the thing that has a dependency on the memberlist for device tracking, which keeps the dependency tree clean (e.g. no setMembers on a class that does both e2ee and non-e2ee). We would also need to be able to encrypt non-megolm events with Olm, like 4S gossiping, etc ...

 ## Verifying devices
 - validate fingerprint
@@ -119,3 +119,140 @@ we'll need to pass an implementation of EventSender or something to SendQueue th

 ## Notes
  - libolm api docs (also for js api) would be great
+
+
+## OO Design
+
+e2ee/MemberList
+    // changes user tracking and returns changed members
+    // this probably needs to be run after updates to the rooms have been written 
+    // to the txn so that if encryption was enabled in the same sync,
+    // or maybe not as we probably don't get device updates for a room we just joined/enabled encryption in.
+
+    async writeSync(txn)
+    emitSync(changes)
+    async addRoom(roomId, userIds, txn)
+    async addMember(roomId, userId, txn)
+    async removeMember(roomId, userId, txn)
+
+    async getMember(userId, txn)
+
+    // where would we use this? to encrypt?
+    // - does that need to be observable? well, at least updatable
+    // - to derive room trust from ... but how will this work with central emit point for room updates?
+        // check observablevalue trust before and after sync and detect change ourselves?
+        // set flag on room when observablevalue trust value emitted update and then reemit in emitSync?
+        // ALSO, we need to show trust for all rooms but don't want to have to load all EncryptionUsers and their devices for all e2ee rooms.
+        // can we build trust incrementally?
+            // trusted + new unverified device = untrusted
+            // trusted + some device got verified = ?? //needs a full recheck, but could be ok to do this after verification / cross signing by other party
+            // trusted + some device/user got unverified = untrusted (not supported yet, but should be possible)
+            // so sounds possible, but depends on how/if we can build decryption without needing all members
+
+    `async openMembersForRoom(roomId) : ObservableMap<userId, EncryptionUser>`
+
+    // can we easily prevent redundancy between e2ee rooms that share the same member?
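The incremental trust rules listed in the MemberList comments above could be sketched roughly as follows. This is only an illustration: the `Trust` and `Change` names and the `applyTrustChange` helper are hypothetical, and extending the rules to starting states other than "trusted" is an assumption that the notes do not spell out.

```js
// Hypothetical sketch of the incremental trust rules from the comments above.
// The state and change names are illustrative, not an existing API.
const Trust = Object.freeze({
    Trusted: "trusted",
    Untrusted: "untrusted",
    NeedsRecheck: "needs-recheck",
});
const Change = Object.freeze({
    NewUnverifiedDevice: "new-unverified-device",
    DeviceVerified: "device-verified",
    DeviceUnverified: "device-unverified",
});

function applyTrustChange(current, change) {
    switch (change) {
        case Change.NewUnverifiedDevice:
        case Change.DeviceUnverified:
            // one unverified device is enough to know the result,
            // no other devices need to be loaded
            return Trust.Untrusted;
        case Change.DeviceVerified:
            // can't be decided incrementally: all devices need a full recheck
            return Trust.NeedsRecheck;
        default:
            // unknown change: leave the trust as it was
            return current;
    }
}
```

Only the "needs-recheck" outcome would require loading all EncryptionUsers and their devices for the room, which is the case the notes suggest deferring until after verification or cross-signing.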
+
+e2ee/EncryptionUser
+    get trackingStatus()
+    get roomIds()
+
+    // how to represent we only keep these in memory for e2ee rooms?
+    // for non-e2ee we would need to load them from storage, so needing an async method,
+    // but for e2ee we probably don't want to await a Promise.resolve for every member when encrypting, decrypting, ... ? or would it be that bad?
+    // should we index by sender key here and assume Device is only used for e2ee? Sounds reasonable ...
+    `get devices() : ObservableMap<senderKey, Device>`
+
+    would be nice if we could expose the devices of a member as an observable list on the member
+    at the same time, we need to know if any member needs updating devices before sending a message... but this state would actually be kept on the member, so that works.
+
+    we do need to aggregate all the trust in a room though for shields... how would trust be added to this?
+
+    ```js
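+    // do we need the map here? (the reducer could also read m.trust directly)
+    // assumes a.compare(b) < 0 means that a is less trusted than b
+    const roomTrust = memberList.members.map(m => m.trust).reduce((minTrust, trust) => {
+        // keep the lowest trust seen so far
+        if (!minTrust || trust.compare(minTrust) < 0) {
+            return trust;
+        }
+        return minTrust;
+    }, undefined); // explicit initial value, so an empty member list gives undefined instead of throwing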
+    ```
+
+e2ee/Device
+    // the session we should use to encrypt with, or null if none exists
+    get outboundSession()
+    // should this be on device or something more specific to crypto? Although Device is specific to crypto ...
+    createOutboundSession()
+
+    // gets the matching session, or creates one if needed/allowed
+    async getInboundSessionForMessage()
+
+
+e2ee/olm/OutboundSession
+    encrypt(type, content, txn) (same txn should be used that will add the message to pendingEvents, here used to advance ratchet)
+
+
+e2ee/olm/InboundSession
+    decrypt(payload, txn)
+
+e2ee/olm/Account
+    // for everything in crypto, we should have a method to persist the intent
+    createOTKs(txn)
+    // ... and another one to upload it, persisting that we have in fact uploaded it
+    uploadOTKs(txn)
+
+DeviceList
+    writeSync(txn)
+    emitSync()
+    queryPending()
+
+
+actually, we need two member stores:
+    - (Member) one for members per room with userid, avatar url, display name, power level, ... (most recent message timestamp)?
+    - (EncryptionUser) one per unique member id for e2ee, with device tracking status, and e2ee rooms the member is in? If we duplicate this over every room the user is in, we complicate device tracking.
+
+the e2ee rooms an EncryptionUser is in need to be notified of device (tracking) changes to update their trust shield. The fact that the device list is outdated or that we will need to create a new olm session when sending a message should not emit an event.
+
+requirements:
+ - Members need to be able to exist without EncryptionUser
+ - Members need to be able to map to an EncryptionUser (by userId) 
+ - Member needs to have trust property derived from their EncryptionUser, with updates triggered somehow in central point, e.g. by Room.emitSync
+    - also, how far do we want to take this central update point thing? I guess any update that will cascade in a room (summary) update ... so here adding a device would cascade into the room trust changing, which we want to emit from Room.emitSync.
+    - hmm, I wonder if it makes sense to do this over member, or rather expose a second ObservableMap on the room for EncryptionUser where we can monitor trust
+        - PROs separate observablemap:
+            - don't need to load member list to display shields of user in timeline ... this might be fine though as e2ee rooms tend to be smaller rooms, and this is only for the room that is open.
+        - CONs separate observablemap:
+            - more clunky api, would need a join operator on ObservableMap to join the trust and Member into one ObservableMap to be able to display member list.
+ - See if it is doable to sync e2ee rooms without having all their encryptionUsers and devices in memory:
+     - Be able to decrypt *without* having all EncryptionUsers of a room and their devices in memory, but rather use indices on storage to load just what we need. Finding a matching inbound olm session is something we need to think how to do best. We'll need to see this one.
+     - Encrypting may require having all EncryptionUsers of a room and their devices in memory, which should be acceptable as we typically only send in the room we are looking at (but not always: forwarding an event, etc. might then require us to "load" some of the machinery for that room, but that should be fine)
+     - Be able to send EncryptionUser updates *without* having all EncryptionUsers and their devices in memory
+
+other:
+ - when populating EncryptionUsers for an e2ee room, we'll also populate Members as we need to fetch them from the server anyway.
+ - Members can also be populated for other reasons (showing member list in non-e2ee room)
+
+
+we should adjust the session store to become a key value store rather than just a single value, we can split up in:
+    - syncData (filterId, syncToken, syncCount)
+    - serverConfig (/versions response)
+    - serialized olm account
+so we don't have to write all of that on every sync to update the sync token
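A minimal sketch of the key-value split described above. The `SessionStore` class, the `olmAccount` key name and the sample values are hypothetical, and a plain `Map` stands in for the real storage backend; the point is only that a sync rewrites the `syncData` entry and nothing else.

```js
// Hypothetical sketch: a key-value session store, so that updating the
// sync token only rewrites the "syncData" entry and leaves the others alone.
// The Map stands in for whatever storage backend is actually used.
class SessionStore {
    constructor(backend = new Map()) {
        this._backend = backend;
        // per-key write counter, only here to illustrate which keys get rewritten
        this.writeCount = new Map();
    }

    get(key) {
        return this._backend.get(key);
    }

    set(key, value) {
        this._backend.set(key, value);
        this.writeCount.set(key, (this.writeCount.get(key) || 0) + 1);
    }
}

const store = new SessionStore();
// placeholder values, written once at login
store.set("serverConfig", {versions: ["r0.6.0"]});
store.set("olmAccount", "<serialized olm account>");
store.set("syncData", {filterId: "f1", syncToken: "s1", syncCount: 1});
// a later sync only touches syncData
store.set("syncData", {filterId: "f1", syncToken: "s2", syncCount: 2});
```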
+
+new stores:
+
+room-members
+e2ee-users
+e2ee-devices
+inbound-olm-sessions
+outbound-olm-sessions
+//tbd:
+inbound-megolm-sessions
+outbound-megolm-sessions
+
+we should create constants with sets of store names that are needed for certain use cases, like write timeline will require [timeline, fragments, inbound-megolm-sessions] which we can reuse when filling the gap, writing timeline sync, ...
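The store-name constants could look something like the sketch below. Only the `writeTimeline` set comes from the note above; the `StoreSets` object, the `combineStoreSets` helper and the combination with `room-members` are hypothetical and only show how such sets could be reused and merged.

```js
// Hypothetical sketch: named sets of store names per use case.
// Only the writeTimeline set is taken from the notes, the rest is illustrative.
const StoreSets = Object.freeze({
    writeTimeline: Object.freeze(["timeline", "fragments", "inbound-megolm-sessions"]),
});

// Combines several sets into one de-duplicated list of store names,
// e.g. to open a single transaction covering multiple use cases.
function combineStoreSets(...sets) {
    return Array.from(new Set(sets.flat()));
}

// illustrative: a gap fill that also needs to write members
const gapFillStores = combineStoreSets(StoreSets.writeTimeline, ["room-members"]);
```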
+
+
+main things to figure out:
+    - how to decrypt? what indices do we need? is it reasonable to do this without having all EncryptionUser/devices in memory?
+        - big part of this is how we can find the matching olm session for an incoming event/create a new olm session
--- a/doc/impl-thoughts/RECONNECTING.md
+++ b/doc/impl-thoughts/RECONNECTING.md
@@ -77,3 +77,7 @@ rooms should report how many messages they have queued up, and each time they se
 thought: do we want to retry a request a couple of times when we can't reach the server before handing it over to the reconnector? Note that some requests may succeed while others may fail, like when matrix.org is really slow, some requests may time out and others may not. Although starting a service like sync while it is still succeeding should be mostly fine. Perhaps we can pass a canRetry flag to the HomeServerApi so that if we get a ConnectionError, we will retry. Only when the flag is not set, we'd call the Reconnector. The downside of this is that if 2 parts are doing requests, 1 retries and 1 does not, and both requests fail, the other part of the code would still be retrying when the reconnector already kicked in. The HomeServerApi should perhaps tell the retryer if it should give up if a non-retrying request already caused the reconnector to kick in?

 CatchupSync should also use timeout 0; otherwise, when there is nothing to report, we spend 30s with a catchup spinner. Riot-web sync also says something about using a 0 timeout until there are no more to_device messages as they are queued up by the server and not all returned at once if there are a lot? This is needed for crypto to be aware of all to_device messages.
+
+We should have a persisted observable value `syncCount` on Sync that just increments with every sync. This way, other parts of the app, like account data, could observe it and take action if something hasn't synced down within a number of syncs. E.g. account data could assume that local changes that got sent to the server were subsequently overwritten by another client if the remote echo didn't arrive within 5 syncs, and we could attempt conflict resolution or give up. We could also show a warning that there is a problem with the server if our own messages don't come down from the server in x syncs. We'd need to store the current syncCount with pieces of pending data like account data and pendingEvents.
+
+Are overflows of this number a problem to take into account? Don't think so, because Number.MAX_SAFE_INTEGER is 9007199254740991, so if you sync on average once a second (which you won't, as you're offline often) it would take Number.MAX_SAFE_INTEGER/(3600*24*365) = 285616414.72415626 years to overflow.
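The estimate above, and the "no remote echo within 5 syncs" check, can be sketched as follows. The `isStale` helper and its parameter names are hypothetical. Strictly speaking, a number past `Number.MAX_SAFE_INTEGER` loses integer precision rather than overflowing, but the conclusion is the same.

```js
// Back-of-the-envelope check of the estimate above: at one sync per second,
// how many years until syncCount reaches Number.MAX_SAFE_INTEGER?
const SECONDS_PER_YEAR = 3600 * 24 * 365;
const yearsUntilUnsafe = Number.MAX_SAFE_INTEGER / SECONDS_PER_YEAR;

// Hypothetical helper: pending data stores the syncCount at which it was sent,
// and is considered stale once maxSyncs syncs have passed without a remote echo.
function isStale(sentAtSyncCount, currentSyncCount, maxSyncs = 5) {
    return currentSyncCount - sentAtSyncCount >= maxSyncs;
}
```

For example, account data sent at syncCount 10 would be treated as stale from syncCount 15 onwards, at which point conflict resolution could be attempted.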