Commit Graph

70 Commits (df8d6823eefebc70f5c7333978dfd7da0ede1ede)

Author SHA1 Message Date
Neil Alexander 72b3160776
Send-to-device messages over federation (#1198)
* Initial work to send send-to-device messages over federation

* Wire up send-to-device consumer, message formatting

* Generate random message ID

* Review comments, update sytest whitelist
2020-07-14 12:33:37 +01:00
Neil Alexander 08e9d996b6
Yggdrasil demo updates
Squashed commit of the following:

commit 6c2c48f862c1b6f8e741c57804282eceffe02487
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 10 16:28:09 2020 +0100

    Add README.md

commit 5eeefdadf8e3881dd7a32559a92be49bd7ddaf47
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 10 10:18:50 2020 +0100

    Fix wedge in federation sender

commit e2ebffbfba25cf82378393940a613ec32bfb909f
Merge: 0883ef88 abf26c12
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 10 09:51:23 2020 +0100

    Merge branch 'master' into neilalexander/yggdrasil

commit 0883ef8870e340f2ae9a0c37ed939dc2ab9911f6
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 10 09:51:06 2020 +0100

    Adjust timeouts

commit ba2d53199910f13b60cc892debe96a962e8c9acb
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Thu Jul 9 16:34:40 2020 +0100

    Try to wake up from peers/sessions properly

commit 73f42eb494741ba5b0e0cef43654708e3c8eb399
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Thu Jul 9 15:43:38 2020 +0100

    Use TransactionWriter to reduce database lock issues on SQLite

commit 08bfe63241a18c58c539c91b9f52edccda63a611
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Thu Jul 9 12:38:02 2020 +0100

    Un-wedge federation

    Squashed commit of the following:

    commit aee933f8785e7a7998105f6090f514d18051a1bd
    Author: Neil Alexander <neilalexander@users.noreply.github.com>
    Date:   Thu Jul 9 12:22:41 2020 +0100

        Un-goroutine the goroutines

    commit 478374e5d18a3056cac6682ef9095d41352d1295
    Author: Neil Alexander <neilalexander@users.noreply.github.com>
    Date:   Thu Jul 9 12:09:31 2020 +0100

        Reduce federation sender wedges

commit 40cc62c54d9e3a863868214c48b7c18e522a4772
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Thu Jul 9 10:02:52 2020 +0100

    Handle switching in/out background more reliably
2020-07-10 16:28:18 +01:00
Neil Alexander 9cc52f47f3
Use TransactionWriter to reduce database lock issues on SQLite (#1192) 2020-07-09 17:48:56 +01:00
Neil Alexander 99b50f30a0
Reduce federation sender wedges (#1191)
* Reduce federation sender wedges

* Un-goroutine the goroutines
2020-07-09 15:39:35 +01:00
Neil Alexander 2bb580c1b0
Handle case where pendingPDUs might get out of sync for some reason 2020-07-08 15:42:36 +01:00
Neil Alexander af6bc47f16
Squashed commit of the following:
commit b4cb47aa1329d2ada10ae6426fd9d2a69f47536a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Jul 8 14:13:27 2020 +0100

    Restrict transaction send context time

commit 7c28205cdb5d842071d46b1ec599d09cca708e57
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Jul 8 14:00:06 2020 +0100

    Add to gobind build

commit d9e2c72e0576a2eb0ce6ac48eed6cc9d4761a0ea
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Jul 8 13:43:21 2020 +0100

    Wake up destination queues for new sessions/links

commit 21766c6c52bd00511d28981457e9034358c32a8d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Jul 8 13:17:18 2020 +0100

    Tweak QUIC parameters
2020-07-08 14:52:48 +01:00
Neil Alexander de0f427ddc Fix build 2020-07-07 16:54:14 +01:00
Neil Alexander 51fd532940 Fix error handling in federationsender 2020-07-07 16:53:10 +01:00
Kegsay 8e9580852d
bugfix: continue sending PDUs if ones are added whilst sending another PDU (#1187)
* Add a bit more logging to the fedsender

* bugfix: continue sending PDUs if ones are added whilst sending another PDU

Without this, the queue goes back to sleep on `<-oq.notifyPDUs` which won't
fire because `pendingPDUs` is already > 0. This should fix a flakey sytest.

* Break if no txn is sent

* Tweak federation sender wake-ups

* Update comments

* Remove break or that'll kill the parent loop

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-07-07 16:36:10 +01:00
Neil Alexander 46dbc46f84
Wake up destination queues more aggressively (#1183)
* Wake up destination queues more aggressively

* We don't really need Ch here do we
2020-07-03 16:31:56 +01:00
Neil Alexander 1773fd84b7
Hydrate destination queues at startup (#1179)
* Hydrate destination queues at startup

* Review comments
2020-07-03 11:49:49 +01:00
Neil Alexander 38caf8e5b7
Yggdrasil+QUIC demo, federation sender tweaks (#1177)
* Initial QUIC work

* Update Yggdrasil demo

* Make sure that the federation sender knows how many pending events are in the database when the worker starts

* QUIC tunables

* pprof

* Don't spin

* Set build info for Yggdrasil
2020-07-02 17:43:07 +01:00
Neil Alexander 42dd962425
Persistent federation sender queues (PDUs) (#1173)
* Initial work on persistent queues

* Update index for event ID and server name

* Put things into database (postgres for now)

* Duplicate postgres code into sqlite for now just to stop build errors, will fix SQLite soon

* Fix table name

* Fix index

* Fix table name

* Use RETURNING because LastInsertID is not supported by postgres

* Use functions

* Marshal headered event

* Don't error on now rows

* Don't block if there are PDUs waiting

* Try to tidy up JSON

* Debug logging

* Fix query, use transactions in postgres

* Clean up

* Rehydrate more opportunistically

* Fix SQLite

* remove unused types

* Review comments

* Shuffle things around a bit

* Clean up transaction properly

* Don't send empty transactions

* Reduce unnecessary retries

* Count PDUs to make more resilient

* Don't stop when there is work to be done

* Try to limit wakeups

* well this is tedious

* Fix race in incomplete transactions

* Thread safety on transaction ID/count
2020-07-01 11:46:38 +01:00
Kegsay 43cddfe00f
Return remote errors from FS.PerformJoin (#1164)
* Return remote errors from FS.PerformJoin

Follows the same pattern as PerformJoin on roomserver (no error return).

Also return the right format for incompatible room version errors.

Makes a bunch of tests pass!

* Handle network errors better when returning remote HTTP errors

* Linting

* Fix tests

* Update whitelist, pass network errors through in API=1 mode
2020-06-25 15:04:48 +01:00
Kegsay 0dc4ceaa2d
Minor perf/debugging improvements (#1121)
* Minor perf/debugging improvements

- publicroomsapi: Don't call QueryEventsByID with no event IDs
- appservice: Consume only if there are 1 or more ASes
- roomserver: don't keep a copy of the request "for debugging" - we trace now

* fedsender: return early if we have no destinations

* Unbreak tests
2020-06-12 15:11:33 +01:00
Kegsay ecd7accbad
Rehuffle where things are in the internal package (#1122)
renamed:    internal/eventcontent.go -> internal/eventutil/eventcontent.go
	renamed:    internal/events.go -> internal/eventutil/events.go
	renamed:    internal/types.go -> internal/eventutil/types.go
	renamed:    internal/http/http.go -> internal/httputil/http.go
	renamed:    internal/httpapi.go -> internal/httputil/httpapi.go
	renamed:    internal/httpapi_test.go -> internal/httputil/httpapi_test.go
	renamed:    internal/httpapis/paths.go -> internal/httputil/paths.go
	renamed:    internal/routing.go -> internal/httputil/routing.go
	renamed:    internal/basecomponent/base.go -> internal/setup/base.go
	renamed:    internal/basecomponent/flags.go -> internal/setup/flags.go
	renamed:    internal/partition_offset_table.go -> internal/sqlutil/partition_offset_table.go
	renamed:    internal/postgres.go -> internal/sqlutil/postgres.go
	renamed:    internal/postgres_wasm.go -> internal/sqlutil/postgres_wasm.go
	renamed:    internal/sql.go -> internal/sqlutil/sql.go
2020-06-12 14:55:57 +01:00
Kegsay 4675e1ddb6
Add trace logging to RoomserverInternalAPI (#1120)
This is a wrapper around whatever impl we have which then logs
the function name/request/response/error.

Also tweak when we log on kafka streams: only log on the producer
side not the consumer side: we've never had issues with comms and
having 1 message rather than N would be nice.
2020-06-12 12:10:08 +01:00
Kegsay ec7718e7f8
Roomserver API changes (#1118)
* s/QueryBackfill/PerformBackfill/g

* OutputEvent now includes AddStateEvents which contain the full event of extra state events

* Only include adds not the current event

* Get adding state right
2020-06-11 19:50:40 +01:00
Kegsay 25cd2dd1c9
Remove unused internal APIs (#1117) 2020-06-11 15:07:16 +01:00
Kegsay 399b6ae334
Remove federationsender producer, which in fact was not a producer (#1115)
* Remove federationsender producer, which in fact was not a producer

* Set the signing struct
2020-06-10 16:54:43 +01:00
Kegsay 4f171c56a8
Split out SetupFooComponent (#1106)
* Split out adding HTTP routes from making internal APIs for clarity

* Split out more components

* Split out more things

* Finish converting

* internal mux for internal routes
2020-06-08 15:51:07 +01:00
Neil Alexander 76ff47c052
Use AuthChainProvider to try and speed up federated joins (#1100)
* Use MissingAuthEventHandler on performjoin to try and speed up cases where we have missing events

* Update gomatrixserverlib

* Use supplied room version

* Use AuthChainProvider

* Tweaks

* Update gomatrixserverlib

* Signature checks
2020-06-05 11:48:52 +01:00
Kegsay f4c676ccdd
Refactor how federationsender gets created (#1095)
* Refactor how federationsender gets created

* s/httpint/inthttp/ for alphabetical niceness with internal package
2020-06-04 14:27:10 +01:00
Kegsay e7d1ac84c3
Add ParseFileURI and use it when dealing with file URIs (#1088)
* Add ParseFileURI and use it when dealing with file URIs

Fixes #1059

* Missing file

* Linting
2020-06-04 11:18:08 +01:00
Neil Alexander 225b72bd42
Don't reset counters before successful outgoing federation request (#1089)
* Don't reset counters before successful outgoing federation request on incoming federation request

* Comments
2020-06-04 10:54:10 +01:00
Kegsay cfc137652e
Add a way to force federationsender to retry sending transactions (#1077)
* Add a way to force federationsender to retry sending transactions

And use it in P2P mode when we pick up new nodes.

* Linting

* Use atomic bool to stop us blocking on the channel
2020-06-01 18:34:08 +01:00
Kegsay fe5cf6f880
fedsender: de-duplicate without sorting server names (#1073) 2020-05-29 13:50:06 +01:00
Neil Alexander 57841fc35e
Read batches from incoming channels (#1067) 2020-05-27 12:16:53 +01:00
Neil Alexander 6d50212f29
Miscellaneous fixes (#1060)
* Add missing routing for PerformDirectoryLookupRequest

* Tweak output

* Fix some bugs in devices

* Don't default to federated room joins in response to invite

* Update sytest-whitelist

* Update comments

* Return correct room ID from PerformJoin

* Fix appservice and EDU server API setup, update sytest-whitelist

* Update sytest-whitelist
2020-05-26 14:41:16 +01:00
Neil Alexander 06d5f1e6dc Fix API paths 2020-05-22 14:14:39 +01:00
Kegsay 3daa2327ed
dendritejs tweaks for persisting sqlite DBs (#1058)
* Use uri.path so we don't have file: in the filename

* New go-sqlite-js version
2020-05-22 12:28:48 +01:00
Neil Alexander fe82e1f725
Separate muxes for public and internal APIs (#1056)
* Separate muxes for public and internal APIs

* Update client-api-proxy and federation-api-proxy so they don't add /api to the path

* Tidy up

* Consistent HTTP setup

* Set up prefixes properly
2020-05-22 11:43:17 +01:00
Kegsay 24d8df664c
Fix #897 and shuffle directory around (#1054)
* Fix #897 and shuffle directory around

* Update find-lint

* goimports

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-05-21 14:40:13 +01:00
Neil Alexander dce4f436f7
Add -api flag to monolith (#1044)
* Add flag for enabling HTTP APIs in monolith mode

* Flag -api

* Only start HTTP APIs if needed
2020-05-18 10:56:43 +01:00
Neil Alexander 5f6f8adaa5
Don't prematurely stop trying to join using servers (#1041)
* Don't prematurely stop trying to join using servers

* Factor out performJoinUsingServer
2020-05-15 13:55:14 +01:00
Neil Alexander 17d27331a3 Fix 'input to Unique() must be sorted' panic 2020-05-07 17:14:32 +01:00
Neil Alexander a16db1c408
Improve federation sender performance, implement backoff and blacklisting, fix up invites a bit (#1007)
* Improve federation sender performance and behaviour, add backoff

* Tweaks

* Tweaks

* Tweaks

* Take copies of events before passing to destination queues

* Don't accidentally drop queued messages

* Don't take copies again

* Tidy up a bit

* Break out statistics (tracked component-wide), report success and failures from Perform actions

* Fix comment, use atomic add

* Improve logic a bit, don't block on wakeup, move idle check

* Don't retry sucessful invites, don't dispatch sendEvent, sendInvite etc

* Dedupe destinations, fix other bug hopefully

* Dispatch sends again

* Federation sender to ignore invites that are destined locally

* Loopback invite events

* Remodel a bit with channels

* Linter

* Only loopback invite event if we know the room

* We should tell other resident servers about the invite if we know about the room

* Correct invite signing

* Fix invite loopback

* Check HTTP response codes, push new invites to front of queue

* Review comments
2020-05-07 12:42:06 +01:00
Neil Alexander 9b1b095b49
Roomserver perform leave (#1004)
* First pass at PerformLeave

* Fix SQLite bulkSelectEventStateKey

* Update gomatrixserverlib

* Fix bugs

* Tidy a bit

* Satisfy King Linter

* Review comments

* Review comments

* Fix constants in SQLite event state keys table
2020-05-04 18:34:09 +01:00
Neil Alexander 5c894efd0e
Roomserver perform join (#1001)
* Add PerformJoin template

* Try roomserver perform join

* Send correct server name to FS API

* Pass through content, try to handle multiple server names

* Fix local server checks

* Don't refer to non-existent error

* Add directory lookups of aliases

* Remove unneeded parameters

* Don't repeat join events into the roomserver

* Unmarshal the content, that would help

* Check if the user is already in the room in the fedeationapi too

* Return incompatible room version error

* Use Membership, don't try more servers than needed

* Review comments, make FS API take list of servernames, dedupe them, break out of loop properly on success

* Tweaks
2020-05-04 13:53:47 +01:00
Neil Alexander f7cfa75886
Limit database connections (#980, #564) (#998)
* Limit database connections (#564)

- Add new options to the config file database:
      max_open_conns: 100
      max_idle_conns: 2
      conn_max_lifetime: -1
- Implement connection parameter setup on the *DB (database/sql) in internal/sqlutil/trace.go:Open()
- Propagate the values in the form of DbProperties interface via all the
  Open() and NewDatabase() functions

Signed-off-by: Tomas Jirka <tomas.jirka@email.cz>

* Fix wasm builds

* Remove file accidentally added from working tree

Co-authored-by: Tomas Jirka <tomas.jirka@email.cz>
2020-05-01 13:34:53 +01:00
Neil Alexander 908108c23e
Rename FS queue package to internal (#997) 2020-05-01 13:01:50 +01:00
Neil Alexander e15f6676ac
Consolidation of roomserver APIs (#994)
* Consolidation of roomserver APIs

* Comment out alias tests for now, they are broken

* Wire AS API into roomserver again

* Roomserver didn't take asAPI param before so return to that

* Prevent roomserver asking AS API for alias info

* Rename some files

* Remove alias_test, incoherent tests and unwanted appservice integration

* Remove FS API inject on syncapi component
2020-05-01 10:48:17 +01:00
Neil Alexander 64e94e9a6f
Join room support in federation sender (#989)
* Implement PerformJoinRequest

* Rename perform functions

* Check send join response

* Temporary wiring to test federation sender room joins

* Actually pass through the config

* Make sure membership content shows join
2020-04-29 15:29:39 +01:00
Neil Alexander a308e61331
Federation sender API remodel (#988)
* Define an input API for the federationsender

* Wiring for rooomserver input API and federation sender input API

* Whoops, commit common too

* Merge input API into query API

* Rename FederationSenderQueryAPI to FederationSenderInternalAPI

* Fix dendritejs

* Rename Input to Perform

* Fix a couple of inputs -> performs

* Remove needless storage interface, add comments
2020-04-29 11:34:31 +01:00
Neil Alexander 3a858afca2
Loopback event from invite response (#982)
* Working invite v2 support

* Fix copyright notice

* Update gomatrixserverlib

* Add fallthrough

* Add missing continue

* Update sytest-whitelist, gomatrixserverlib

* Update gomatrixserverlib to test matrix-org/gomatrixserverlib#181

* Update gomatrixserverlib
2020-04-28 10:53:07 +01:00
Neil Alexander 3ab8ebf6b8
More invite support (#979)
* Update gomatixserverlib

* Try to build invite stripped state if not given to us

* SendInvite improvements

* Transpose invite_room_state into invite_state.events for sync API

* Remove syncapi debugging output

* Use RespInviteV2

* Update gomatrixserverlib

* Send the invite event as a normal roomserver event too, for incorporating into room (should this be done by the roomserver automatically for invite inputs?)

* Federation sender use invite_room_state, room server try to insert membership state

* Check supported room versions on the invite endpoint

* Prevent roomserver query API from trying to handle requests for stub rooms

* Adding a nolint

* Replace IsRoomStub with RoomNIDExcludingStubs, fix query API to use that instead

* Review comments
2020-04-24 16:30:25 +01:00
Neil Alexander c30b12b5a1
Fix sarama import URLs (#856)
* Fix sarama import URLs

* Update gomatrixserverlib

* Update naffka

* Update naffka

* Update in kafka-producer
2020-04-22 15:26:56 +01:00
Kegsay c1bca95adb
Add SQL tracing via DENDRITE_TRACE_SQL (#968)
* Add SQL tracing via DENDRITE_TRACE_SQL

Add this to `internal/sqlutil` in preparation for #897

* Not entirely
2020-04-16 10:06:55 +01:00
Neil Alexander 067b875063
Invites v2 endpoint (#952)
* Start converting v1 invite endpoint to v2

* Update gomatrixserverlib

* Early federationsender code for sending invites

* Sending invites sorta happens now

* Populate invite request with stripped state

* Remodel a bit, don't reflect received invites

* Handle invite_room_state

* Handle room versions a bit better

* Update gomatrixserverlib

* Tweak order in destinationQueue.next

* Revert check in processMessage

* Tweak federation sender destination queue code a bit

* Add comments
2020-04-03 14:29:06 +01:00
Ben B 955244c092
use custom http client instead of the http DefaultClient (#823)
This commit replaces the default client from the http lib with a custom one.
The previously used default client doesn't come with a timeout. This could cause
unwanted locks.
That solution chosen here creates a http client in the base component dendrite
with a constant timeout of 30 seconds. If it should be necessary to overwrite
this, we could include the timeout in the dendrite configuration.
Here it would be a good idea to extend the type "Address" by a timeout and
create an http client for each service.

Closes #820

Signed-off-by: Benedikt Bongartz <benne@klimlive.de>

Co-authored-by: Kegsay <kegan@matrix.org>
2020-04-03 11:40:50 +01:00