Where Should Your Code Live Besides GitHub?

This is Part 2 of a series. In Part 1, I made the case that open source licenses protect your code but not the infrastructure around it. GitHub, npm, and the platforms we depend on are all controlled by a single US corporation, subject to US trade controls and corporate decisions we have no say in.

Today: what are the actual options?

I spent time researching every platform, tool, and strategy I could find for keeping a copy of your code somewhere other than GitHub. Some are excellent. Some are promising but immature. Some are traps. Here is what I found.

First: What Are You Actually Protecting?

Before evaluating options, it helps to be specific about what lives on GitHub and what can be moved.

Portable (lives in the Git repository itself):

  • Source code and full commit history
  • Branches and tags
  • Workflow files (though they only run on GitHub)

Platform-locked (lives on GitHub’s servers, not in your repo):

  • Issues and their comment threads
  • Pull requests and review comments
  • CI/CD pipeline execution (GitHub Actions)
  • Security advisories and Dependabot alerts
  • Stars, forks, watchers, and the social graph
  • GitHub Pages configuration
  • GitHub Packages

No mirroring solution preserves everything. The question is which pieces matter most to you and what you are willing to invest to protect them.

The Platforms

Codeberg: The Nonprofit Alternative

Codeberg is a German nonprofit running Forgejo (a community fork of Gitea).[1] Hosted entirely in the EU, subject to German law and GDPR. Free to use, funded by donations.

What migrates from GitHub: Git history, issues and comments, milestones, labels, pull requests, releases, and wiki pages. Codeberg’s built-in migration wizard handles this.

What does not migrate: GitHub Actions workflows will be in the repo but will not run. Forgejo Actions uses a similar but not identical syntax, and files go in .forgejo/workflows/ instead of .github/workflows/.[2] Stars and fork counts reset to zero. Dependabot, security advisories, and GitHub Packages have no equivalent.
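As a starting point for that rework, the existing workflow files can be copied into the directory Forgejo reads. The following is a minimal Python sketch, not an official migration tool: the helper name copy_workflows_for_forgejo is my own invention, and the copied files still need manual review because the syntax is similar but not identical.

```python
import shutil
from pathlib import Path


def copy_workflows_for_forgejo(repo_root: str) -> list[str]:
    """Copy .github/workflows/*.yml into .forgejo/workflows/ (hypothetical helper).

    Files that already exist in .forgejo/workflows/ are left untouched so that
    hand-edited Forgejo versions are never overwritten.
    Returns the names of the files that were copied.
    """
    root = Path(repo_root)
    source = root / ".github" / "workflows"
    target = root / ".forgejo" / "workflows"
    if not source.is_dir():
        return []

    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for workflow in sorted(source.iterdir()):
        if workflow.suffix not in {".yml", ".yaml"}:
            continue  # ignore anything that is not a workflow file
        destination = target / workflow.name
        if destination.exists():
            continue  # keep the existing Forgejo version
        shutil.copy2(workflow, destination)
        copied.append(workflow.name)
    return copied
```

The copy only puts the files where Forgejo looks for them. Whether each workflow actually runs depends on which actions and runner features it uses.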

CI/CD: Two options exist. Woodpecker CI is available but requires manual approval and is described as “provided as-is and might break at any time.” Forgejo Actions is in limited open alpha on Codeberg’s hosted runners, but you can connect your own self-hosted runner without needing a public IP.

The honest assessment: Codeberg is the strongest GitHub alternative for someone who wants a functional forge, not just a backup. The migration path is real and imports most of the non-code artifacts that matter. The tradeoffs are CI/CD maturity and the absence of commercial SLAs. For a personal project collection, these are acceptable. For a project with complex CI pipelines, expect rework.

GitLab.com: The Enterprise Option

GitLab has built-in pull mirroring that can sync from GitHub automatically every 30 minutes.[3] It syncs git data only (branches, tags, commits), not issues, PRs, or wikis.

The catch: Pull mirroring requires a paid plan (Premium or Ultimate). It is not available on the free tier. After 14 consecutive sync failures, the mirror stops and must be manually restarted.

Free tier workaround: You can set up a GitLab CI pipeline that periodically fetches from GitHub and pushes, but this consumes your 400 free CI minutes per month.
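To get a feel for how quickly that budget runs out, here is a small back-of-the-envelope calculation in Python. The 400-minute quota comes from the text above; the per-run duration is an assumption you would need to measure for your own repository.

```python
def min_sync_interval_hours(minutes_per_run: float,
                            monthly_quota: int = 400,
                            days_in_month: int = 30) -> float:
    """Shortest sync interval (in hours) that stays within the CI quota."""
    if minutes_per_run <= 0:
        raise ValueError("minutes_per_run must be positive")
    runs_per_month = monthly_quota / minutes_per_run
    hours_in_month = days_in_month * 24
    return hours_in_month / runs_per_month


# Assumed example durations: one and two billed CI minutes per mirror job.
print(min_sync_interval_hours(1.0))  # 1.8
print(min_sync_interval_hours(2.0))  # 3.6
```

Under these assumptions, a one-minute mirror job cannot run more often than about every two hours, and that leaves nothing for the project's own pipelines.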

The honest assessment: GitLab’s free tier is not a viable automatic mirror. If you are already paying for GitLab, pull mirroring is a solid, low-maintenance option. If you are not, the free tier limitations make it more work than it is worth compared to Codeberg or self-hosting.

sourcehut: The Minimalist’s Choice

sourcehut (sr.ht) is Drew DeVault’s platform, built on mailing lists and git send-email instead of pull requests.[4] Pricing is pay-what-you-can: three tiers from $4 to $12/month, with financial aid available. Contributing to existing projects is free.

The workflow difference: Contributors submit patches via email. Maintainers apply them with git am. There are no pull requests in the GitHub sense. CI uses builds.sr.ht with YAML manifests and full VMs (not containers), including SSH access to running builds.

The honest assessment: sourcehut is philosophically compelling and technically solid, but it requires a workflow shift that most GitHub-native developers will find unfamiliar. As a mirror destination, it works fine for git data. As a collaboration platform, it demands commitment to the email-patch workflow. If you are already comfortable with git send-email, sourcehut is excellent. If you have never used it, expect a learning curve.

The Archives

Software Heritage: The UNESCO-Backed Archive

Software Heritage is a French nonprofit backed by UNESCO and Inria.[5] It automatically crawls public repositories across GitHub, GitLab, Bitbucket, npm, PyPI, and other platforms. The archive currently holds over 437 million projects and 28 billion source files.

What gets archived: The full git repository: all branches, all tags, all commits, complete history. Not just a snapshot of HEAD.

What does not get archived: Issues, pull requests, wikis, CI configuration (as runnable pipelines), or Git LFS objects (only LFS pointers are stored, not the large files themselves).
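To see whether the LFS gap affects a repository, one rough check is to look for paths routed through the LFS filter in .gitattributes. This is a simplified Python sketch: the function name is mine, and it assumes patterns without embedded spaces (quoted patterns are not handled).

```python
def lfs_tracked_patterns(gitattributes_text: str) -> list[str]:
    """Return the path patterns that .gitattributes assigns to Git LFS."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attributes = line.split()
        if "filter=lfs" in attributes:
            patterns.append(pattern)
    return patterns


sample = """\
# large design assets
*.psd filter=lfs diff=lfs merge=lfs -text
*.txt text
"""
print(lfs_tracked_patterns(sample))  # ['*.psd']
```

If the list is empty, the repository most likely has no LFS content and the archived copy is complete as far as file contents go.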

How often is your repo crawled? It depends on activity and popularity. The Linux kernel is visited every 1-2 days. Less active repos may be visited less frequently. You can check your repo’s last visit via the Software Heritage API.
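A minimal Python sketch of such a lookup follows. It assumes the public "latest visit" endpoint of the Software Heritage web API, and the response fields named in the usage comment ("date", "status") are what I would expect to find there rather than a guaranteed schema. The helper names are my own.

```python
import json
import urllib.request

API_ROOT = "https://archive.softwareheritage.org/api/1"


def latest_visit_url(origin_url: str) -> str:
    """Build the 'latest visit' endpoint URL for a repository origin."""
    return f"{API_ROOT}/origin/{origin_url}/visit/latest/"


def fetch_latest_visit(origin_url: str) -> dict:
    """Query the archive (network access required) and return the JSON body."""
    with urllib.request.urlopen(latest_visit_url(origin_url), timeout=30) as response:
        return json.load(response)


# Usage (not executed here because it needs network access):
#   visit = fetch_latest_visit("https://github.com/torvalds/linux")
#   print(visit.get("date"), visit.get("status"))
```

An HTTP 404 from this endpoint would suggest the repository has never been visited, which is the cue to queue it manually.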

“Save Code Now”: If your repo has not been crawled recently (or ever), you can trigger an immediate archive at archive.softwareheritage.org/save/. Just enter the repository URL and it queues a visit.

Persistent identifiers (SWHIDs): Every archived artifact gets a SoftWare Hash IDentifier, which is now an ISO standard.[5] The format is swh:1:rev:<sha1> for commits, and the SHA1 is the same as the git commit hash. This means you can reference a specific commit in an academic paper, a legal filing, or a dependency spec, and it will resolve forever regardless of what happens to GitHub.
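Because the identifier reuses the commit hash, building one from a local repository is mechanical. Here is a short Python sketch; the function name is my own, and it deliberately rejects abbreviated hashes since a SWHID needs the full 40 hex characters.

```python
import re

_SHA1_HEX = re.compile(r"[0-9a-f]{40}")


def commit_swhid(commit_hash: str) -> str:
    """Turn a full git commit hash into a revision SWHID (swh:1:rev:<sha1>)."""
    digest = commit_hash.strip().lower()
    if not _SHA1_HEX.fullmatch(digest):
        raise ValueError("expected a full 40-character SHA-1 commit hash")
    return f"swh:1:rev:{digest}"


# Made-up hash, for illustration only.
print(commit_swhid("0123456789abcdef0123456789abcdef01234567"))
# swh:1:rev:0123456789abcdef0123456789abcdef01234567
```

The full hash for the current commit is what git rev-parse HEAD prints.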

The honest assessment: Software Heritage is not a forge. You cannot collaborate there, file issues, or run CI. What it does is guarantee that your source code, with full history, survives regardless of what happens to any platform. The cost is zero. There is no reason not to verify your repos are archived and use “Save Code Now” for anything that is not.

GitHub Arctic Code Vault: Not What You Think

GitHub’s Arctic Code Vault is a one-time snapshot taken on February 2, 2020, stored on film reels in a decommissioned coal mine in Svalbard, Norway.[6] 21 TB of data on 186 reels, designed for 1,000-year preservation.

What was captured: The HEAD of the default branch of qualifying public repos. Not full git history. Not issues. Not PRs. Just a snapshot of the code as it existed on one day in 2020.

Can you opt in? No. It was automatic based on activity and star count criteria. There is no mechanism to add new repos to the 2020 snapshot.

The honest assessment: This is a fascinating preservation project, but it is not a backup strategy. You cannot retrieve code from it in any practical way, and it captured a single point in time over six years ago.

Internet Archive / Wayback Machine: A Trap

The Wayback Machine crawls web pages. When it visits github.com/user/repo, it saves the rendered HTML of the GitHub UI, not the git repository data. You get a saved copy of the page that displays your README, not a cloneable repo.

The honest assessment: The Wayback Machine is not a viable strategy for preserving git history. Do not rely on it for source code backup. It is useful for preserving a human-readable snapshot of documentation, but that is all.

Self-Hosting

Forgejo: Your Own Forge

Forgejo is the same software Codeberg runs. It deploys as a single Go binary, uses SQLite by default, and runs comfortably on a 1 GB RAM VPS for a personal collection of 50+ repositories.[7]

Migration: The built-in wizard imports from GitHub, GitLab, Bitbucket, and several other platforms. It pulls git history, issues, milestones, labels, PRs, releases, and wikis.

Pull mirroring: Set up during repo creation (check “This repository will be a mirror”). Syncs periodically with a manual “Synchronize Now” button. One caveat: you cannot convert an existing repo to a pull mirror after creation.

Push mirroring: Available for existing repos. Supports SSH with Ed25519 keys. Can filter by branch using glob patterns. LFS over SSH is not supported.

Maintenance: Binary updates (download, restart), SQLite backups (one file to copy), and standard OS patching. No automatic update mechanism.

The honest assessment: If you already have server infrastructure, Forgejo is the most complete self-hosted option. The migration path from GitHub is the best in class, and the maintenance burden is minimal. The main risk is that you are now responsible for uptime, backups, and security of another service.

The DIY Approach

git bundle: The Simplest Backup

git bundle is a built-in Git command that creates a single portable file containing your entire repository.[8]

Running git bundle create <file> --all produces a self-contained file with all branches, tags, and reachable commits. You can clone directly from it by passing the bundle file to git clone in place of a URL.
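The commands involved (creating a bundle, verifying it, and cloning from it) can be sketched in Python as follows. The git subcommands and the --all flag are standard; the helper names and the idea of wrapping them in a script are my own assumptions about how you might automate this.

```python
import subprocess


def bundle_create_command(bundle_path: str) -> list[str]:
    """git bundle create <file> --all: pack every branch and tag into one file."""
    return ["git", "bundle", "create", bundle_path, "--all"]


def bundle_verify_command(bundle_path: str) -> list[str]:
    """git bundle verify <file>: check that the bundle is valid and usable."""
    return ["git", "bundle", "verify", bundle_path]


def clone_from_bundle_command(bundle_path: str, target_dir: str) -> list[str]:
    """git clone <file> <dir>: a bundle can stand in for a remote URL."""
    return ["git", "clone", bundle_path, target_dir]


def back_up(repo_dir: str, bundle_path: str) -> None:
    """Create and verify a bundle of the repository in repo_dir.

    bundle_path should be absolute, because git runs inside repo_dir.
    """
    subprocess.run(bundle_create_command(bundle_path), cwd=repo_dir, check=True)
    subprocess.run(bundle_verify_command(bundle_path), cwd=repo_dir, check=True)
```

Verifying right after creation is cheap insurance: a backup that cannot be read is only discovered to be useless on the day it is needed.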

What is included: All git objects and refs. Full history.

What is not included: Issues, PRs, wikis, CI configuration, hooks, stash, or reflog.

The honest assessment: A bundle file on an external drive, a NAS, or cloud storage (S3, B2, etc.) is the simplest possible backup. It requires no account on any platform, no API keys, and no ongoing service. Automate it with a cron job and you have a reliable, if minimal, safety net. It will not help with discoverability or collaboration if GitHub goes away, but it will keep your code.
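For the cron-job variant, the script mostly needs a naming scheme and a way to discard old bundles. Here is one possible Python sketch; the date-stamped filename convention and the keep-the-newest-N retention rule are my assumptions, not something git prescribes.

```python
import re
from datetime import date
from pathlib import Path


def bundle_filename(repo_name: str, day: date) -> str:
    """Date-stamped name, e.g. myrepo-2026-05-01.bundle (assumed convention)."""
    return f"{repo_name}-{day.isoformat()}.bundle"


def bundles_to_prune(backup_dir: str, repo_name: str, keep: int = 7) -> list[Path]:
    """Return this repo's dated bundles except the newest `keep` of them.

    ISO dates sort chronologically as plain strings, so sorting by
    filename is enough to order the bundles from oldest to newest.
    """
    pattern = re.compile(rf"{re.escape(repo_name)}-\d{{4}}-\d{{2}}-\d{{2}}\.bundle")
    dated = sorted(p for p in Path(backup_dir).iterdir() if pattern.fullmatch(p.name))
    return dated[:-keep] if keep > 0 else dated
```

The function only reports which files are candidates for deletion. Actually removing them is left to the caller, so a bug in the naming logic cannot silently wipe the backup directory.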

GitHub Actions Mirroring: Automated Push to Secondary Remotes

A GitHub Actions workflow can push your repo to Codeberg, GitLab, sourcehut, or any other git remote on every commit.[9] The pattern described here comes from a real-world workflow.

Key details: fetch-depth: 0 is required for full history. Each destination should be a separate parallel job so failures in one do not block others. Use --force to overwrite any divergence on the mirror.
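The push step at the heart of each mirror job can be sketched in Python like this. It is an illustration of the logic, not the workflow file itself: the function names are mine, the remote URLs in the comment are placeholders, and the loop only approximates "separate parallel jobs" by continuing past a failed destination instead of stopping.

```python
import subprocess


def mirror_push_command(remote_url: str, branch: str) -> list[str]:
    """Force-push the checked-out commit to `branch` on the mirror."""
    return ["git", "push", "--force", remote_url, f"HEAD:refs/heads/{branch}"]


def push_to_mirrors(mirrors: dict[str, str], branch: str,
                    run=subprocess.run) -> dict[str, bool]:
    """Push to every mirror and report success per destination.

    check=False keeps one failing destination from aborting the rest.
    The `run` parameter exists so the function can be tested without git.
    """
    results = {}
    for name, url in mirrors.items():
        completed = run(mirror_push_command(url, branch), check=False)
        results[name] = completed.returncode == 0
    return results


# Usage with placeholder remotes (hypothetical account and repo names):
#   push_to_mirrors({
#       "codeberg": "git@codeberg.org:example-user/example-repo.git",
#       "gitlab": "git@gitlab.com:example-user/example-repo.git",
#   }, branch="main")
```

In the actual workflow, the equivalent of the results dictionary is the per-job status GitHub shows for each destination.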

Security considerations: The SSH private key stored in GitHub Actions secrets has write access to the destination. If GitHub is compromised, an attacker could push to your mirrors. Use deploy keys (repo-scoped) rather than personal SSH keys to limit blast radius. Pin action versions to commit SHAs, not tags, to prevent supply-chain attacks.

The honest assessment: This is the most practical automated approach if you want to keep GitHub as your primary while maintaining live mirrors. The irony is not lost on me that the automation runs on GitHub itself. If GitHub goes down, the mirror stops updating. But for the scenarios I described in Part 1 (acquisitions, policy changes, corporate direction shifts), you would have a current mirror to switch to.

What About Radicle?

Radicle is a peer-to-peer code collaboration network built on Git. No central server; repos are distributed across “seed” nodes using a gossip protocol.

The honest assessment: Radicle 1.0 shipped in 2024, but during my research, the radicle.xyz website was unreachable. Real-world workflows I found include error-handling fallbacks like || echo "Radicle sync attempted", which does not inspire confidence in reliability. The concept is sound (true decentralization, no single point of failure), but the project is not yet mature enough to recommend as a backup strategy. Worth watching, not yet worth depending on.

The Comparison

Option                | Git history      | Issues/PRs       | CI/CD               | Cost                 | Effort     | Jurisdiction
Codeberg              | Full             | Yes (import)     | Limited alpha       | Free                 | Low        | Germany/EU
GitLab (paid)         | Full             | No (mirror only) | Full                | $29+/user/mo         | Low        | US
sourcehut             | Full             | Own format       | Yes (different)     | $4-12/mo             | Medium     | US
Software Heritage     | Full             | No               | No                  | Free                 | Zero       | France/EU
Forgejo (self-hosted) | Full             | Yes (import)     | Self-hosted runners | VPS cost             | Medium     | You decide
git bundle            | Full             | No               | No                  | Free                 | Low        | Your desk drawer
GitHub Actions mirror | Full             | No               | N/A                 | Free (within limits) | Low-Medium | Destination varies
Wayback Machine       | No               | No               | No                  | Free                 | N/A        | N/A
Arctic Code Vault     | HEAD only (2020) | No               | No                  | N/A                  | N/A        | Norway (permafrost)

So What Would I Do?

If I had to pick a strategy today for my own repos, here is what I would consider:

  1. Software Heritage as a baseline. Verify my repos are archived, use “Save Code Now” for anything missing. Zero effort, zero cost, permanent record.
  2. Codeberg as a live mirror. Either via GitHub Actions push-on-commit, or periodic manual sync. A functional forge I could point people to if GitHub became unavailable.
  3. git bundle on a schedule to local storage. Belt-and-suspenders for the truly critical repos. A cron job, a NAS, done.

That is three layers: an archive, a live mirror, and a local backup. Total cost: zero (or the price of a small VPS if I wanted to self-host Forgejo instead of using Codeberg).

In Part 3, I will set this up for real with my own repositories and walk through the process step by step.


What is your backup strategy? Are you mirroring, archiving, or hoping for the best? Find me on Bluesky or LinkedIn.


Notes

[1] Codeberg e.V. is a German registered nonprofit association (eingetragener Verein) headquartered in Berlin. Founded September 2018, launched publicly January 2019. All servers hosted in the EU. As of November 2025: over 300,000 repositories, over 200,000 registered users. Source: Wikipedia: Codeberg.

[2] Forgejo Actions uses workflow files in .forgejo/workflows/ with syntax that is largely compatible with GitHub Actions but not identical. Codeberg’s documentation states explicitly: “things might not work right away.” Hosted runners are in limited open alpha; self-hosted runners are available and do not require a public IP (the runner polls the Forgejo instance). Source: Codeberg CI documentation.

[3] GitLab pull mirroring syncs approximately every 30 minutes. Available on Premium and Ultimate tiers only; not available on the Free tier. After 14 consecutive sync failures, the mirror is marked as a “hard failure” and must be manually restarted. Only git data (branches, tags, commits) is synced; issues, PRs, wikis, and releases are not. Source: GitLab pull mirroring documentation.

[4] sourcehut is operated by Drew DeVault and contributors. 100% open source (AGPL), can be self-hosted. Contribution workflow uses git send-email and mailing lists instead of pull requests. CI (builds.sr.ht) runs full VMs with SSH access to running/failed builds. Recommended max repo size is approximately 6 GiB (Linux kernel scale). Source: sourcehut documentation.

[5] Software Heritage is operated by Inria, backed by UNESCO. Archive statistics (live, verified May 2026): 437 million+ projects, 28 billion+ source files, 6 billion+ commits. The crawler visits repos on a recurring schedule (the Linux kernel is visited every 1-2 days). SWHIDs are an ISO standard as of 2025. For commits, the SWHID uses the same SHA1 as the git commit hash. “Save Code Now” is available at archive.softwareheritage.org/save/. Source: softwareheritage.org, Wikipedia: Software Heritage.

[6] The GitHub Arctic Code Vault captured a snapshot on February 2, 2020, of the HEAD of qualifying public repos. 21 TB on 186 reels of photosensitive film, stored 250 meters deep in permafrost in a decommissioned coal mine in Svalbard, Norway. Film designed for 1,000-year preservation. No opt-in mechanism; inclusion was automatic based on activity and star count. Source: GitHub Archive Program.

[7] Forgejo deploys as a single Go binary with SQLite as the default database. Official documentation states SQLite is sufficient for at least 10 users and describes Forgejo as requiring “an order of magnitude less resources than other forges.” In practice, a 1 GB RAM VPS handles a personal collection of 50+ repos comfortably. Migration wizard supports import from GitHub, GitLab, Bitbucket, Gitea, Gogs, and others. LFS over SSH is not supported for push mirrors. Source: Forgejo installation documentation, Forgejo mirroring documentation.

[8] git bundle creates a single binary file containing git pack data and ref headers. Using --all includes all branches, tags, and reachable objects. The bundle can be cloned directly with git clone. Supports incremental bundles for ongoing backups. Does not include working tree, hooks, stash, reflog, or platform metadata (issues, PRs). Source: git-bundle documentation.

[9] The GitHub Actions mirroring pattern described is based on a real-world workflow. Key requirements: fetch-depth: 0 for full history, SSH deploy keys stored as secrets, and ssh-keyscan to accept host keys. Security considerations: pin action versions to commit SHAs (not tags) to prevent supply-chain attacks; use repo-scoped deploy keys rather than personal SSH keys; be aware that any collaborator with write access to workflow files can exfiltrate secrets. Source: verified from hyperpolymath/hypatia repository workflow files.