Commit Graph

108 Commits (d72d634391b937810c23c1c5cd9724ecfd96caf0)

Author SHA1 Message Date
Neil Alexander 192a7a7923
Roomserver input backpressure metric
Squashed commit of the following:

commit 56e934ac0aeedcfb2c072010959ba49734d4e0cb
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:39:30 2021 +0100

    Fix metric

commit 3911f3a0c17b164b012e881c085ceca30f5de408
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:36:29 2021 +0100

    Register correct metric

commit a9ddbfaed421538a701151801e9451198a8be4f3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:33:33 2021 +0100

    Try to capture RS input backpressure metric
2021-07-02 09:48:55 +01:00
Neil Alexander 7c3991ee2f
Use a custom FIFO queue for the RS input API (#1888)
* Use a FIFO queue instead of a channel to reduce backpressure

* Make sure someone wakes up

* Tweaks

* Add comments
2021-06-28 15:11:36 +01:00
Kegsay af41f6d454
Add Sentry support (#1803)
* Add Sentry support

* Use HTTP Sentry properly maybe

* Capture panics

* Log fed Sentry stuff correctly

* British english linter
2021-03-24 10:25:24 +00:00
Neil Alexander 01267a34b9
Fix nil pointer crash in QueryMembershipsForRoom 2021-03-17 13:58:04 +00:00
Kegsay 3c419be6af
roomserver: don't make_join with ourselves if clients ask us to (#1797)
* roomserver: don't make_join with ourselves if clients ask us to

* delete properly
2021-03-08 18:16:28 +00:00
Will Hunt 9557ccada4
Fix appsevice alias queries part 2 (#1684)
* Check membership of room

* Use QueryStateAfterEventsResponse

* Fix complexity

* Add field ShouldHitAppservice to GetRoomIDForAlias

* Hit appservice when trying to join a non-existent alias

* remove unused

* Changes that I made a long time ago

* Rename to appserviceJoinedAtEvent

* Check membership in GetMemberships

* Update QueryMembershipsForRoom

* Tweaks in client API

* Update appserviceJoinedAtEvent

* Comments

* Try QueryMembershipForUser instead

* Undo some changes to client API that shouldn't be needed

* More /event tweaks

* Refactor /event bit

* Go back to QueryMembershipsForRoom because appservices are hard

* Fix bugs in onMessage

* Add comments

* More logical naming, clean up a bit

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-03-03 17:00:31 +00:00
Will Hunt a2773922d2
Send events to appservice based on room membership (#1680)
* Check membership of room

* Use QueryStateAfterEventsResponse

* Fix complexity

* Changes that I made a long time ago

* Rename to appserviceJoinedAtEvent

* Check membership in GetMemberships

* Update QueryMembershipsForRoom

* Tweaks in client API

* Update appserviceJoinedAtEvent

* Comments

* Try QueryMembershipForUser instead

* Undo some changes to client API that shouldn't be needed

* More /event tweaks

* Refactor /event bit

* Go back to QueryMembershipsForRoom because appservices are hard

* Fix bugs in onMessage

* Add comments

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-03-03 16:27:44 +00:00
Neil Alexander d15836e260
Increase gocyclo complexity to 25 (and remove all but 2 golint directives related to it) (#1783) 2021-03-03 14:35:57 +00:00
Neil Alexander 5d74a1757f
Don't query for servers so often in /send (#1766)
* Look up servers less often, don't hit API for missing auth events unless there are actually missing auth events

* Remove ResolveConflictsAdhoc (since it is already in GMSL), other tweaks

* Update gomatrixserverlib to matrix-org/gomatrixserverlib#254

* Fix resolve-state

* Initialise t.servers on first use
2021-02-16 17:12:17 +00:00
Neil Alexander 02e6d89cc2
Fix crash in membership updater (#1753)
* Fix nil pointer exception in membership updater

* goimports
2021-02-06 11:49:18 +00:00
Neil Alexander de5f22a469
Remove redundant check (#1748) 2021-02-04 11:12:52 +00:00
Kegsay 6d1c6f29e0
Add m.room.create to invite stripped state (#1740)
MSC1772 needs this because the create event contains info on if
the room is a space or not. The create event itself isn't sensitive
so other people may find this useful too.
2021-01-29 11:36:26 +00:00
Matthew Hodgson 0571d395b5
Peeking over federation via MSC2444 (#1391)
* a very very WIP first cut of peeking via MSC2753.

doesn't yet compile or work.
needs to actually add the peeking block into the sync response.
checking in now before it gets any bigger, and to gather any initial feedback on the vague shape of it.

* make PeekingDeviceSet private

* add server_name param

* blind stab at adding a `peek` section to /sync

* make it build

* make it launch

* add peeking to getResponseWithPDUsForCompleteSync

* cancel any peeks when we join a room

* spell out how to runoutside of docker if you want speed

* fix SQL

* remove unnecessary txn for SelectPeeks

* fix s/join/peek/ cargocult fail

* HACK: Track goroutine IDs to determine when we write by the wrong thread

To use: set `DENDRITE_TRACE_SQL=1` then grep for `unsafe`

* Track partition offsets and only log unsafe for non-selects

* Put redactions in the writer goroutine

* Update filters on writer goroutine

* wrap peek storage in goid hack

* use exclusive writer, and MarkPeeksAsOld more efficiently

* don't log ascii in binary at sql trace...

* strip out empty roomd deltas

* re-add txn to SelectPeeks

* re-add accidentally deleted field

* reject peeks for non-worldreadable rooms

* move perform_peek

* fix package

* correctly refactor perform_peek

* WIP of implementing MSC2444

* typo

* Revert "Merge branch 'kegan/HACK-goid-sqlite-db-is-locked' into matthew/peeking"

This reverts commit 3cebd8dbfbccdf82b7930b7b6eda92095ca6ef41, reversing
changes made to ed4b3a58a7855acc43530693cc855b439edf9c7c.

* (almost) make it build

* clean up bad merge

* support SendEventWithState with optional event

* fix build & lint

* fix build & lint

* reinstate federated peeks in the roomserver (doh)

* fix sql thinko

* todo for authenticating state returned by /peek

* support returning current state from QueryStateAndAuthChain

* handle SS /peek

* reimplement SS /peek to prod the RS to tell the FS about the peek

* rename RemotePeeks as OutboundPeeks

* rename remote_peeks_table as outbound_peeks_table

* add perform_handle_remote_peek.go

* flesh out federation doc

* add inbound peeks table and hook it up

* rename ambiguous RemotePeek as InboundPeek

* rename FSAPI's PerformPeek as PerformOutboundPeek

* setup inbound peeks db correctly

* fix api.SendEventWithState with no event

* track latestevent on /peek

* go fmt

* document the peek send stream race better

* fix SendEventWithRewrite not to bail if handed a non-state event

* add fixme

* switch SS /peek to use SendEventWithRewrite

* fix comment

* use reverse topo ordering to find latest extrem

* support postgres for federated peeking

* go fmt

* back out bogus go.mod change

* Fix performOutboundPeekUsingServer

* Fix getAuthChain -> GetAuthChain

* Fix build issues

* Fix build again

* Fix getAuthChain -> GetAuthChain

* Don't repeat outbound peeks for the same room ID to the same servers

* Fix lint

* Don't omitempty to appease sytest

Co-authored-by: Kegan Dougal <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-01-22 14:55:08 +00:00
Neil Alexander 244ff0dccb
Don't create so many state snapshots when updating forward extremities (#1718)
* Light-weight checking of state changes when updating forward extremities

* Only do this for non-state events, since state events will always result in state change at extremities
2021-01-18 13:21:33 +00:00
Neil Alexander 3ac693c7a5
Add dendrite_roomserver_processroomevent_duration_millis to prometheus
Squashed commit of the following:

commit e5e2d793119733ecbcf9b85f966e018ab0318741
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Jan 13 17:28:12 2021 +0000

    Add dendrite_roomserver_processroomevent_duration_millis to prometheus
2021-01-13 17:31:46 +00:00
Neil Alexander e1e34b8994
Deep-checking forward extremities (#1698) 2021-01-11 12:47:25 +00:00
Neil Alexander fac71edc62
Fix #1655 by re-adding the appservice alias query (#1660) 2020-12-18 13:33:28 +00:00
Neil Alexander 2885eb0422
Don't use request context for input room event queued tasks (#1640) 2020-12-14 14:40:57 +00:00
Neil Alexander f5869daaab
Don't start more goroutines than needed on RS input, increase input worker buffer size (#1638) 2020-12-14 10:42:21 +00:00
Neil Alexander d9b3035342
Adjust latest events updater (#1623)
* Adjust forward elatest events updater

* Populate newLatest in all cases

* Re-add existingPrevs loop
2020-12-09 13:34:37 +00:00
Kegsay b507312d4c
MSC2836 threading: part 2 (#1596)
* Update GMSL

* Add MSC2836EventRelationships to fedsender

* Call MSC2836EventRelationships in reqCtx

* auth remote servers

* Extract room ID and servers from previous events; refactor a bit

* initial cut of federated threading

* Use the right client/fed struct in the response

* Add QueryAuthChain for use with MSC2836

* Add auth chain to federated response

* Fix pointers

* under CI: more logging and enable mscs, nil fix

* Handle direction: up

* Actually send message events to the roomserver..

* Add children and children_hash to unsigned, with tests

* Add logic for exploring threads and tracking children; missing storage functions

* Implement storage functions for children

* Add fetchUnknownEvent

* Do federated hits for include_children if we have unexplored children

* Use /ev_rel rather than /event as the former includes child metadata

* Remove cross-room threading impl

* Enable MSC2836 in the p2p demo

* Namespace mscs db

* Enable msc2836 for ygg

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-12-04 14:11:01 +00:00
Neil Alexander be7d8595be
Peeking updates (#1607)
* Add unpeek

* Don't allow peeks into encrypted rooms

* Fix send tests

* Update consumers
2020-12-03 11:11:46 +00:00
Neil Alexander b5aa7ca3ab
Top-level setup package (#1605)
* Move config, setup, mscs into "setup" top-level folder

* oops, forgot the EDU server

* Add setup

* goimports
2020-12-02 17:41:00 +00:00
Neil Alexander b4c3692dcc
Optimise CheckServerAllowedToSeeEvent (#1602)
* Try to limit how many state events we have to unmarshal

* Comments
2020-12-02 11:45:50 +00:00
Kegsay 6353b0b7e4
MSC2836: Threading - part one (#1589)
* Add mscs/hooks package, begin work for msc2836

* Flesh out hooks and add SQL schema

* Begin implementing core msc2836 logic

* Add test harness

* Linting

* Implement visibility checks; stub out APIs for tests

* Flesh out testing

* Flesh out walkThread a bit

* Persist the origin_server_ts as well

* Edges table instead of relationships

* Add nodes table for event metadata

* LEFT JOIN to extract origin_server_ts for children

* Add graph walking structs

* Implement walking algorithm

* Add more graph walking tests

* Add auto_join for local rooms

* Fix create table syntax on postgres

* Add relationship_room_id|servers to the unsigned section of events

* Persist the parent room_id/servers in edge metadata

Other events cannot assert the true room_id/servers for the
parent event, only make claims to them, hence why this is
edge metadata.

* guts to pass through room_id/servers

* Refactor msc2836 to allow handling from federation

* Add JoinedVia to PerformJoin responses

* Fix tests; review comments
2020-11-19 11:34:59 +00:00
Neil Alexander 20a01bceb2
Pass pointers to events — reloaded (#1583)
* Pass events as pointers

* Fix lint errors

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update to matrix-org/gomatrixserverlib#240
2020-11-16 15:44:53 +00:00
Mayeul Cantan af41fcadc4
Fix Dendrite not backfilling on world_readable rooms (#1575)
The previous implementation was only checking if room history was
"shared", which it wasn't for rooms where a user was invited, or world
readable rooms.
This implementation leverages the IsServerAllowed method, which already
implements the complete verification algorithm.

Signed-off-by: `Mayeul Cantan <oss+matrix@mayeul.net>`

Co-authored-by: Kegsay <kegan@matrix.org>
2020-11-16 10:47:16 +00:00
S7evinK eccd0d2c1b
Implement forgetting about rooms (#1572)
* Add basic storage methods

* Add internal api handler

* Add check for forgotten room

* Add /rooms/{roomID}/forget endpoint

* Add missing rsAPI method

* Remove unused parameters

* Add passing tests

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>

* Add missing file

* Add postgres migration

* Add sqlite migration

* Use Forgetter to forget room

* Remove empty line

* Update HTTP status codes

It looks like the spec calls for these to be 400, rather than 403: https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-rooms-roomid-forget

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-11-05 10:19:23 +00:00
S7evinK d5675feb96
Add possibilty to configure MaxMessageBytes for sarama (#1563)
* Add configuration for max_message_bytes for sarama

* Log all errors when sending multiple messages

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>

* Add missing config

* - Better comments on what MaxMessageBytes is used for
- Also sets the size the consumer may use
2020-10-27 14:11:37 +00:00
Neil Alexander 3afc623098
Fix RewritesState bug (#1557)
* Set RewritesState once

* Check if any new state provided

* Obey rewritesState

* Don't nuke everything the sync API knows when purging state

* Fix panic from duplicate insert

* Consistency

* Use HasState

* Remove nolint

* Clean up joined rooms on state rewrite
2020-10-22 10:39:16 +01:00
Neil Alexander 04dc019e5e
Don't set empty state snapshots 2020-10-21 16:21:36 +01:00
Neil Alexander 534f9a9eb6
Refactor forward extremities (#1556)
* Add resolve-state helper

* Tweaks

* Refactor forward extremities, again

* Tweaks

* Minor optimisation

* Make path a bit clearer

* Only process state/membership if forward extremities have changed

* Usage comments in resolve-state
2020-10-21 15:37:07 +01:00
Neil Alexander 6c3c621de0
Remove invalid state delta check (#1550) 2020-10-20 12:36:16 +01:00
Kegan Dougal 837c295c26 Linting 2020-10-20 12:29:53 +01:00
Kegsay eb86e2b336
Fix sqlite locking bugs present on sytest (#1543)
* Fix sqite locking bugs present on sytest

Comments do the explaining.

* Fix deadlock in sqlite mode

Caused by starting a writer whilst within a writer

* Only complain about invalid state deltas for non-overwrite events

* Do not re-process outlier unnecessarily
2020-10-20 11:42:54 +01:00
Neil Alexander 6e63df1d9a
KindOld (#1531)
* Add KindOld

* Don't process latest events/memberships for old events

* Allow federationsender to ignore duplicate key entries when LatestEventIDs is duplicated by RS output events

* Signal to downstream components if an event has become a forward extremity

* Don't exclude from sync

* Soft-fail checks on KindNew

* Don't run the latest events updater at all for KindOld

* Don't make federation sender change after all

* Kind in federation sender join

* Don't send isForwardExtremity

* Fix syncapi

* Update comments

* Fix SendEventWithState

* Update sytest-whitelist

* Generate old output events

* Sync API consumes old room events

* Update comments
2020-10-19 14:59:13 +01:00
Neil Alexander 7a1fd123de
Improved state handling in /send (#1521)
* Capture errors

* Don't request only state key tuples needed for auth (we end up discarding room state this way)

* QueryStateAfterEvent returns all state when no tuples supplied

* Resolve state

* Comments
2020-10-14 12:39:37 +01:00
Neil Alexander 8001627cfc
Get missing event tweaks (#1514)
* Adjust backfill to send backward extremity with state before other backfilled events, include prev_events with no state amongst missing events

* Not finished refactor

* Fix test

* Remove isInboundTxn

* Remove debug logging
2020-10-12 15:56:15 +01:00
Neil Alexander 8035c50c06
Don't get into situations where we have no forward extremities 2020-10-08 14:46:25 +01:00
Neil Alexander 78f6e1a31e
Don't set new state NID if state regression 2020-10-08 13:30:30 +01:00
Kegsay a846dad0e1
Return what we have when we encounter missing events when servicing backfill/gme (#1499)
We expect to have missing events as we walk back in the DAG over federation
as we didn't always create the room. When checking if the server is allowed
to see those events, just give up and stop rather than fail the request.
2020-10-08 12:22:00 +01:00
Neil Alexander 429bd48129
Return a non-fatal error to the federation API on a state regression (#1498)
* Return a non-fatal error to the federation API on a state regression

* Return no error but don't morph state
2020-10-08 12:13:50 +01:00
Neil Alexander d821f9d3c9
Deep checking of forward extremities (#1491)
* Deep forward extremity calculation

* Use updater txn

* Update error

* Update error

* Create previous event references in StoreEvent

* Use latest events updater to row-lock prev events

* Fix unexpected fallthrough

* Fix deadlock

* Don't roll back

* Update comments in calculateLatest

* Don't include events that we can't find references for in the forward extremities

* Add another passing test
2020-10-07 14:05:33 +01:00
Neil Alexander f7c15071de
Don't return 500s on checking to see if a remote server is allowed to see an event we don't know about (#1490) 2020-10-07 10:30:27 +01:00
Kegsay 0f7e707f39
Optimise servers to backfill from (#1485)
- Prefer perspective servers if they are in the room.
- Limit the number of backfill servers to 5 to avoid taking too long.
2020-10-06 18:09:02 +01:00
Neil Alexander 4feff8e8d9
Don't give up if we fail to fetch a key (#1483)
* Don't give up if we fail to fetch a key

* Fix logging line

* furl nolint
2020-10-06 17:59:08 +01:00
Neil Alexander bf90db5b60
Remove KindRewrite (#1481)
* Don't send rewrite events

* Remove final traces of rewrite events

* Remove test that is no longer needed

* Revert "Remove test that is no longer needed"

This reverts commit 9a45babff690480acd656a52f2c2950a5f7e9ada.

* Update test to use KindOutlier
2020-10-06 11:05:00 +01:00
Neil Alexander 2e71d2708f
Resolve state after event against current room state when determining latest state changes (#1479)
* Resolve state after event against current room state when determining latest state changes

* Update sytest-whitelist

* Update sytest-whitelist, blacklist
2020-10-05 17:47:08 +01:00
Neil Alexander 738b829a23
Fetch missing auth events, implement QueryMissingAuthPrevEvents, try other servers in room for /event and /get_missing_events (#1450)
* Try to ask other servers in the room for missing events if the origin won't provide them

* Logging

* More logging

* Implement QueryMissingAuthPrevEvents

* Try to get missing auth events badly

* Use processEvent

* Logging

* Update QueryMissingAuthPrevEvents

* Try to find missing auth events

* Patchy fix for test

* Logging tweaks

* Send auth events as outliers

* Update check in QueryMissingAuthPrevEvents

* Error responses

* More return codes

* Don't return error on reject/soft-fail since it was ultimately handled

* More tweaks

* More error tweaks
2020-09-29 13:40:29 +01:00
Neil Alexander 3013ade84f
Reject make_join for empty rooms (#1439)
* Sanity-check room version on RS event input

* Update gomatrixserverlib

* Reject make_join when no room members are left

* Revert some changes from wrong branch

* Distinguish between room not existing and room being abandoned on this server

* nolint
2020-09-24 16:18:13 +01:00