Sourcehut

[Resolved] Planned outage for all services

Maintenance complete: The issue with pages.sr.ht has been resolved and all services are now available. (10:30 UTC — Aug 3)

Maintenance mostly complete but pages.sr.ht still pending: We have completed maintenance on all services, which can now be expected to be stable, with the exception of pages.sr.ht. We have encountered an issue during the pages.sr.ht upgrades and are addressing it now. (10:00 UTC — Aug 3)

Planned maintenance on August 3rd will cause intermittent outages: Planned maintenance will affect all services, causing intermittent outages that are expected to last between 15 and 30 minutes at most. The total maintenance period should last less than 2 hours. (09:00 UTC — Aug 3)

[Resolved] Unplanned git.sr.ht outage

Snapshot growth caused an outage on git.sr.ht

git.sr.ht’s ZFS snapshots grew to consume all available disk space. This is normally an understood pathology of the server configuration, but due to a change in billing with Twilio, our paging script did not alert the operators to the imminent issue. It seems that there is not a grace period with Twilio; they reported the billing issue to us only yesterday.

The issue with git.sr.ht has been resolved, the bill has been paid, and we are researching ways to avoid this occurring again in the future. (15:00 UTC — May 14)
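
One possible guard against a silent paging failure of this kind is to try more than one notification channel in order. The following is a minimal sketch under stated assumptions, not the actual paging script: `send_alert` and the channel callables are hypothetical names, and each channel is assumed to raise an exception when delivery fails.

```python
from typing import Callable, Iterable, Optional, Tuple

# A channel is a (name, send) pair; `send` is assumed to raise on failure.
Channel = Tuple[str, Callable[[str], None]]


def send_alert(message: str, channels: Iterable[Channel]) -> Optional[str]:
    """Try each channel in order; return the name of the first that succeeds.

    Returns None if every channel failed, so the caller can escalate.
    """
    for name, send in channels:
        try:
            send(message)
        except Exception as exc:
            # A failed channel must not stop the alert from going out elsewhere.
            print(f"alert channel {name!r} failed: {exc}")
            continue
        return name
    return None
```

With this shape, a provider that rejects messages for billing reasons becomes a logged failure followed by a fallback, rather than a missed page.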

[Resolved] Planned outage for all services

Planned maintenance has been completed. (16:30 UTC — Feb 8)

Planned maintenance is now underway. (15:00 UTC — Feb 8)

Planned maintenance on February 8th will cause intermittent outages: Planned maintenance will affect all services, causing intermittent outages that are expected to last between 15 and 30 minutes at most. (15:00 UTC — Feb 8)

[Resolved] SpamCop outage

One of our third-party DNSBL services, SpamCop, allowed their domain to expire, presumably as a mistake, and began to return “listed” for all DNSBL checks. We use a DNSBL as an early rejection for spam emails, and this caused 21 incoming emails to be wrongfully rejected. We have removed SpamCop from our list of DNSBLs and filed a ticket to improve our monitoring so that we may catch this sooner. Incoming emails are working now, but be advised that if your postmaster uses SpamCop, emails from Sourcehut will likely be rejected until the issue is resolved. (13:25 UTC — Jan 31)
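
A DNSBL that answers "listed" for every query can be detected with a sanity check before its results are trusted. The sketch below assumes the list follows the RFC 5782 convention for IPv4 lists, where 127.0.0.2 is always listed as a test entry and 127.0.0.1 is never listed. The function names and the zone `bl.example.org` are hypothetical, and this is not the monitoring we actually deployed.

```python
import socket
from typing import Callable


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: the IPv4 octets reversed, followed by the zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str,
              resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if the DNSBL returns any A record for the address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        # socket.gaierror (a subclass of OSError) covers NXDOMAIN: not listed.
        return False
    return True


def dnsbl_is_sane(zone: str,
                  resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """A healthy list includes the test entry and excludes 127.0.0.1."""
    return (is_listed("127.0.0.2", zone, resolve)
            and not is_listed("127.0.0.1", zone, resolve))
```

A zone whose domain has expired and now resolves every name fails the second condition, so a periodic call such as `dnsbl_is_sane("bl.example.org")` could raise an alarm or temporarily disable the list.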

[Resolved] Email issues

An issue with mail delivery caused some emails to be dropped. Upgrades to our mail server caused authentication to unexpectedly and silently fail, and it took some time to detect. During this period, ticket submission to todo.sr.ht failed, and emails were not forwarded from lists.sr.ht. You can thank Oracle for changing the license terms on Berkeley DB, which had wider-reaching consequences than we expected. (19:00 UTC — Jan 21)

[Resolved] Network outage

An issue with our upstream ISP caused an outage. Our ISP decided to ring in the new year by unplugging us for 12 minutes last night. Sorry about that. (09:03 UTC — Jan 8)

[Resolved] Unscheduled maintenance for git.sr.ht

The issue was addressed without disruption. While investigating the disk growth, we found that an error in our ZFS snapshot retention policy was the cause. The disk space was trivially reclaimed without incurring an outage. (15:50 UTC — Oct 8)

Accelerating disk usage on git.sr.ht has increased the urgency of a planned migration. On September 18th, we were tracking the growth of git.sr.ht disk space usage toward an expected exhaustion of disk space in November. However, in the intervening time, the rate of growth has accelerated and we are now urgently seeking to migrate to a server we have prepped with more storage space. This may cause disruptions as git.sr.ht may become read-only during the migration process. (15:00 UTC — Oct 8)
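
A projection like the one described above can be made by extrapolating linearly from two usage samples. This is only an illustrative sketch: `projected_exhaustion` is a hypothetical name, the figures in the usage note are made up, and a straight-line estimate becomes too optimistic once growth accelerates, which is why it needs to be recomputed from recent samples.

```python
from datetime import datetime
from typing import Optional


def projected_exhaustion(t0: datetime, used0: float,
                         t1: datetime, used1: float,
                         capacity: float) -> Optional[datetime]:
    """Linearly extrapolate two usage samples to the time the disk fills.

    used0, used1 and capacity share one unit (for example GiB).
    Returns None if usage is flat or shrinking between the samples.
    """
    growth = used1 - used0
    if growth <= 0:
        return None
    remaining = capacity - used1
    # Time left = elapsed time scaled by (space remaining / space consumed).
    return t1 + (t1 - t0) * (remaining / growth)
```

For instance, with hypothetical samples of 700 GiB and 750 GiB taken ten days apart on a 1000 GiB disk, the remaining 250 GiB lasts five times the sampling interval, so the estimate lands 50 days after the second sample.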

[Resolved] Planned outage for hg.sr.ht

Maintenance complete. Depending on how soon your DNS server picks up the updates, you should find service restored shortly. (21:00 UTC — Jul 29)

We’re still waiting on the latest data to transfer from the old server to the new. It is taking longer than we expected. The process is about halfway complete. We apologise for the delay. (19:00 UTC — Jul 29)

Maintenance may require more time than expected. (17:50 UTC — Jul 29)

[Resolved] Unplanned git.sr.ht web outage

An unexpected Redis failure caused a partial outage of git.sr.ht

The git.sr.ht web service was partially interrupted today due to unexpected Redis errors. Some pages which rely on Redis for caching returned 500 errors, namely the summary and tree pages.

The issue has been resolved. To prevent its recurrence, we have filed a ticket to make our caching system more tolerant of Redis outages, and added an alarm to detect this kind of issue earlier. (12:07 UTC — Jun 28)
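
A cache that tolerates backend outages generally treats any cache error as a miss and falls back to computing the value directly. The sketch below illustrates the idea without depending on a specific Redis client. The `Cache` protocol and the `cached` helper are hypothetical names, and this is not the change tracked in our ticket.

```python
import logging
from typing import Callable, Optional, Protocol


class Cache(Protocol):
    """Minimal interface assumed of the cache client (hypothetical)."""
    def get(self, key: str) -> Optional[bytes]: ...
    def set(self, key: str, value: bytes) -> object: ...


def cached(cache: Cache, key: str, compute: Callable[[], bytes]) -> bytes:
    """Return the cached value for key, computing it on a miss or cache error."""
    try:
        hit = cache.get(key)
    except Exception:
        # Cache unavailable: serve the page uncached instead of returning a 500.
        logging.warning("cache read failed for %r; computing directly", key)
        return compute()
    if hit is not None:
        return hit
    value = compute()
    try:
        cache.set(key, value)
    except Exception:
        logging.warning("cache write failed for %r; continuing", key)
    return value
```

The trade-off is that pages become slower while the cache is down, since every request recomputes its value, but they continue to load.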

[Resolved] Planned outage for all services

Planned maintenance is complete. (19:07 UTC — Jun 15)

Planned maintenance is underway. (18:00 UTC — Jun 15)

Planned maintenance on June 15th will cause intermittent outages: Planned maintenance will affect all services, causing intermittent outages that are expected to last between 15 and 30 minutes at most. (18:00 UTC — Jun 15)