I don't use Signal, so I can only repeat what I've heard. That said, my understanding is that they could until a recent change, and now in most cases they can't... unless by "random people" you include employees of Signal, for whom nothing has changed: they can absolutely see your number and metadata, all neatly packaged in one place with everyone else's, just a third-party doctrine request away from the US government.
...I'll stay #selfhosting Matrix / XMPP.
Just note that "your metadata" actually extends to fewer things than what any XMPP or Matrix homeserver can see.
Signal knows your number, when you last connected, whether your account should be discoverable by phone number, how many devices you have linked, when you created your account, and whether anyone can use sealed sender or not.
If for some reason a message doesn't arrive, they store the encrypted data until it can be sent.
Meanwhile, your average Matrix homeserver (and any other homeserver participating in a room) has a full copy of the chat history and more, not including encrypted contents but including:
For XMPP, it's much the same.
You're right that Signal has less metadata than a Matrix / XMPP server. To me it's still a greater risk because it's centralized in one place, making it a far more attractive target to threat actors.
Also, they can send Signal a third-party doctrine request to hand over all of my data without me knowing. If they want to try that with my Matrix / XMPP server, they need to send that same request directly to me, and they're getting no one else's data with it.
Decentralization matters.
@Blort I know that decentralisation matters, I advocate for it regularly.
I'm just trying to make it perfectly clear that promoting Matrix or XMPP in place of Signal when metadata privacy is important is not necessarily the best idea.
Really, we're comparing a small amount of metadata for a ton of users in one place versus a larger amount of metadata for a tiny fraction of users (e.g. millions of users of Signal versus around 5 for one of my servers).
I hear your point about the level of per-user metadata stored, and it's valid. My main point is that there's more total metadata available in a single silo, making it a more attractive target. Being run by a third party also makes it more opaque when requests are made.