
Migration Process

The process of migrating to GHEC-US

Migrations will consist of three phases:

  1. Pre-migration: VA teams learn the migration process, review the limitations of the migration tooling and the destination environment, and plan for the required downtime and for restoring resources that do not migrate completely. Details on these limitations are contained in the sections below. Recommendations for pre-migration planning are given in the migration planning section.
  2. Migration: VA teams perform migrations using the provided tools. This phase is detailed below.
  3. Post-migration: VA teams restore repository resources that did not fully migrate, verify connectivity to integrations and external resources such as self-hosted runners, and test Actions workflows and the other DevOps processes they rely on. Recommendations for post-migration steps are given in the migration planning section.

Performing Migrations

The GitHub team is building IssueOps migration tooling for VA engineering teams. IssueOps is the practice of using GitHub Issues and Actions as an interface for automating workflows. Our IssueOps migration implementation is a wrapper around GitHub’s migration product, GitHub Enterprise Importer (GEI).

The bulk of the migration functionality comes from GEI. You can view GitHub’s documentation on the data that is migrated with GEI and data that is not migrated.

The GitHub team is building custom migration extensions to migrate some of the data types that GEI does not support. See the Migration Limitations section below for more information.

GitHub migrations with GEI are essentially a two-step process:

  1. Source repositories and their metadata are exported into two compressed archives, one for Git source and another for GitHub metadata (issues, PRs, repository settings, etc.).
  2. Those archives are then imported to recreate the repository in the destination organization.

The GitHub team will create a repository in GHEC-US to contain the VA’s migration IssueOps implementation. The migration workflow will be as follows:

  1. VA teams will create issues in this repository using a specific issue template. The issue template will ask VA teams for the necessary information, such as the repositories being migrated and VA organizational information such as eMASS and VASI IDs. VA users will need to be members of a migrators GitHub team to have access to this repository. To migrate repositories, VA users will also need direct access to the repositories being migrated (either through a team or through direct user permissions).
  2. After the issue is created, an Actions workflow will run to verify the inputs and to confirm that the user has permission to perform the migration. The validation workflow will add comments to the issue to relay the validation status and to inform users of the next steps.
  3. Users will be able to run migrations by entering slash commands in the issue comments. A slash command is a comment that begins with a slash, e.g., /my-command. Users will be able to invoke two types of migration workflows using slash commands: dry runs and production migrations. The only difference between the two is locking: production migrations lock the source repositories so that no further changes can occur in the source, while dry runs leave them unlocked.
  4. These slash commands will start Actions workflows to perform the migrations with the selected options. The migration workflows will report status back to the issue through comments. While the issue template will allow users to specify multiple repositories, each repository migration is an independent operation on the backend, so it will be possible for some repositories to succeed while others fail.
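
The slash-command handling and per-repository independence described in steps 3 and 4 can be sketched in Python as follows. This is an illustration only, not the VA tooling: the command names `/run-dry-run-migration` and `/run-production-migration`, the `MigrationRequest` type, and the `migrate_one` callback are all hypothetical, since the actual commands and workflow internals are not specified here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional

# Hypothetical command names (assumption): the real slash commands are not
# given in this document. The value says whether the source gets locked.
COMMANDS = {
    "/run-dry-run-migration": False,     # dry run: source stays unlocked
    "/run-production-migration": True,   # production: source is locked
}


@dataclass(frozen=True)
class MigrationRequest:
    lock_source: bool


def parse_slash_command(comment_body: str) -> Optional[MigrationRequest]:
    """Return a request if the comment's first line is a known slash command."""
    stripped = comment_body.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0].strip()
    if first_line not in COMMANDS:
        return None  # ordinary comment or unknown command: ignore it
    return MigrationRequest(lock_source=COMMANDS[first_line])


def migrate_all(
    repos: Iterable[str],
    request: MigrationRequest,
    migrate_one: Callable[[str, bool], None],
) -> Dict[str, str]:
    """Migrate each repository independently and collect a status per repository.

    A failure in one repository is recorded and does not stop the others.
    """
    results: Dict[str, str] = {}
    for repo in repos:
        try:
            migrate_one(repo, request.lock_source)
            results[repo] = "succeeded"
        except Exception as exc:  # keep going; report the failure instead
            results[repo] = f"failed: {exc}"
    return results
```

In this sketch, the dictionary returned by `migrate_all` stands in for the status that the workflow would post back to the issue as comments, so a partial outcome (some repositories migrated, others not) remains visible per repository.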

Migrating a single repository typically takes less than 30 minutes; however, repositories with Git or metadata sizes approaching the limits could take over an hour.

The following video shows a short walkthrough of an early version of the migrations IssueOps workflow. Note that the issue form is not yet finalized and is subject to change.

User Attribution and Mannequins

The GitHub user accounts that created resources in the source organizations will not exist in the destination organizations. As a result, when resources are migrated, the user attribution data on those resources (e.g., the user who created an issue) will not be valid in the destination organization. GitHub handles this limitation by attributing user activity to mannequins. Mannequins are placeholders for user activity that can later be reclaimed by GitHub users in the destination organization.

Our migration workflow for the VA will attempt to reclaim mannequins automatically after every migration. This automatic process has two requirements:

  1. The source (GHEC) user account has a verified va.gov email address associated with it
  2. The user has already been onboarded to the destination GHEC-US enterprise (with the same va.gov email address)

This process identifies and reclaims any unclaimed mannequins if the associated user can be identified, not just those created by the corresponding repository migrations. If a mannequin was not reclaimed during its initial migration due to unmet requirements, it may still be reclaimed in future migrations if the user satisfies those requirements.
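
As a rough illustration of the matching rule above, the following Python sketch pairs unclaimed mannequins with destination users by email address. It is a sketch under stated assumptions, not the actual reclaim implementation: the `Mannequin` type, the `plan_reclaims` function, and the shape of the destination user lookup (a dictionary keyed by lowercase email) are hypothetical, and the sketch only computes which mannequins could be reclaimed without calling any GitHub API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Mannequin:
    login: str                     # login of the source account
    verified_email: Optional[str]  # verified email on the source account, if any
    claimed: bool = False          # True once a destination user has reclaimed it


def plan_reclaims(
    mannequins: Iterable[Mannequin],
    destination_users_by_email: Dict[str, str],
) -> Dict[str, str]:
    """Map mannequin login -> destination login for every mannequin that matches.

    Assumes the keys of destination_users_by_email are lowercase email addresses.
    """
    plan: Dict[str, str] = {}
    for m in mannequins:
        # Skip mannequins that are already claimed or lack a verified email.
        if m.claimed or not m.verified_email:
            continue
        email = m.verified_email.strip().lower()
        # Requirement 1: the source account has a verified va.gov address.
        if not email.endswith("@va.gov"):
            continue
        # Requirement 2: a destination user is onboarded with the same address.
        target = destination_users_by_email.get(email)
        if target is not None:
            plan[m.login] = target
    return plan
```

Because the function is given every unclaimed mannequin rather than only those from the latest migration, a mannequin skipped in an earlier run is picked up in a later one once its user meets both requirements.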

See GitHub's "Reclaiming mannequins for GitHub Enterprise Importer" documentation for more information on mannequins and the reclaim process.