Warning

Migrations are coming to VA GitHub. See the Migrations section for more information.

Migration Process

The process of migrating to GHEC-US

Migrations will consist of three phases:

  1. Pre-migration: The pre-migration phase will consist of VA teams understanding the migration process, migration and destination environment limitations, and planning for the required downtime and restoration of resources that do not migrate completely. Details on the migration and destination environment limitations are contained in the sections below. Recommendations for pre-migration planning are given in the migration planning section.
  2. Migration: The migration phase will consist of VA teams performing migrations using the provided tools. This phase is detailed below.
  3. Post-migration: The post-migration phase will consist of restoring repository resources that did not fully migrate, ensuring connectivity to the integrations and external resources such as self-hosted runners, and testing Actions workflows and other DevOps processes that teams rely on. Recommendations for post-migration steps are given in the migration planning section.

Performing Migrations

The GitHub team has built IssueOps migration tooling for VA engineering teams. IssueOps is the practice of using GitHub Issues and Actions as an interface for automating workflows. Our IssueOps migration implementation is a wrapper around GitHub’s migration product, GitHub Enterprise Importer (GEI).

The bulk of the migration functionality comes from GEI. You can view GitHub’s documentation on the data that is migrated with GEI and data that is not migrated.

The GitHub team is providing custom migration extensions to migrate some of the data types that GEI does not support. See the Migration Limitations section for more information.

GitHub migrations with GEI are essentially a two-step process:

  1. Source repositories and their metadata are exported into two compressed archives, one for Git source and another for GitHub metadata (issues, PRs, repository settings, etc.).
  2. Those archives are then imported to recreate the repository in the destination organization.
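
As a rough illustration of what a wrapper around GEI might invoke, the sketch below builds the argument list for the `gh gei migrate-repo` command from GitHub's GEI CLI extension, which runs both the export and the import steps for a single repository. It assumes the wrapper shells out to that CLI, which this page does not confirm; the organization and repository names are placeholders, and a real migration to GHEC-US would likely need additional options (authentication, target API URL) that are not shown here.

```python
from typing import List


def build_gei_command(
    source_org: str, source_repo: str, target_org: str, target_repo: str
) -> List[str]:
    """Build (but do not run) a `gh gei migrate-repo` invocation.

    Assumption: the migration tooling calls the gh CLI extension directly.
    The command is returned as a list so it could be passed to
    subprocess.run without shell quoting problems.
    """
    return [
        "gh", "gei", "migrate-repo",
        "--github-source-org", source_org,   # organization that owns the source repository
        "--source-repo", source_repo,        # repository to export
        "--github-target-org", target_org,   # destination organization
        "--target-repo", target_repo,        # name of the recreated repository
    ]


# Placeholder names, for illustration only.
command = build_gei_command("example-source-org", "my-repo", "example-target-org", "my-repo")
print(" ".join(command))
```

Returning a list rather than a single string keeps the sketch safe to hand to a process runner and easy to extend with further options.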

The GHEC-US migration-actions repository contains the VA’s migration IssueOps implementation. The migration workflow is as follows:

  1. VA teams will create issues in this repository, using a specific issue template for the desired migration operation. The issue template will ask VA teams for the necessary information, such as the repository being migrated and VA organizational information such as eMASS and VASI IDs. To migrate a repository, VA users will need to have access to the repository being migrated (either through a team or through direct user permissions).
  2. After the issue is created, an Actions workflow will run to verify the inputs and confirm that the user has permission to perform the migration. The validation workflow will add comments to the issue to relay the validation status and to inform users of the next steps.
  3. Users will be able to run migrations by entering slash commands in the issue comments. A slash command is just a comment that begins with a slash, e.g. /run-production-migration. Users will be able to invoke two types of migration workflows using slash commands: dry runs and production migrations. The only difference between the two is that dry runs do not lock the source repository. Production migrations will lock the source repository so that no further changes can occur in the source.
  4. These slash commands will start Actions workflows to perform the migrations with the selected options. The migration workflows will report status back to the issue through comments.
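
The slash-command step above can be sketched as a small parser that a workflow might run against each new issue comment. This is a minimal illustration, not the actual implementation: `/run-production-migration` comes from the text above, while `/run-dry-run-migration`, `MigrationRequest`, and `parse_slash_command` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Maps each recognized command to whether the source repository gets locked.
# "/run-dry-run-migration" is a hypothetical command name (assumption).
COMMANDS = {
    "/run-dry-run-migration": False,     # dry run: source repository stays unlocked
    "/run-production-migration": True,   # production: source repository is locked
}


@dataclass(frozen=True)
class MigrationRequest:
    command: str
    lock_source: bool


def parse_slash_command(comment_body: str) -> Optional[MigrationRequest]:
    """Return a MigrationRequest if the comment begins with a known slash command."""
    stripped = comment_body.strip()
    if not stripped.startswith("/"):
        return None  # ordinary comment, not a command
    command = stripped.split()[0]
    if command not in COMMANDS:
        return None  # unrecognized command is ignored
    return MigrationRequest(command=command, lock_source=COMMANDS[command])


print(parse_slash_command("/run-production-migration"))
print(parse_slash_command("Thanks, looks good!"))
```

Keeping the lock decision in a single lookup table reflects the point made above: the lock on the source repository is the only difference between the two migration types.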

Migrating a single repository typically takes less than 30 minutes; however, repositories with Git or metadata sizes approaching the limits could take over an hour.

The following video shows a short walkthrough of an early version of the migrations IssueOps workflow. Note that the issue form is not yet finalized and is subject to change.

More explicit migration instructions will be available in the GHEC-US handbook.

User Attribution and Mannequins

The GitHub user accounts that created resources in the source organizations will not exist in the destination organizations. As a result, when resources are migrated, the user attribution data on those resources (e.g. the user that created an issue) will not be valid in the destination organization. GitHub handles this limitation by attributing user activity to mannequins. Mannequins are placeholders for user activity that can later be reclaimed by GitHub users in the destination organization.

Our migration workflow for the VA will attempt to reclaim mannequins automatically on a schedule. This automatic process has two requirements:

  1. The source user account has a va.gov email address associated with it. When migrating from GHEC, the va.gov email address needs to be verified on the user’s account. When migrating from GHEC-EMU it is not necessary for the va.gov email address to be verified.
  2. The user has already been onboarded to the destination GHEC-US enterprise (with the same va.gov email address).

This process identifies and reclaims any unclaimed mannequins if the associated user can be identified in any of the source systems (GHEC, GHEC-EMU, or GHES).
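
The two requirements above can be expressed as a simple eligibility check. The sketch below is an illustration under stated assumptions, not the scheduled reclaim job itself: `SourceUser` and `can_auto_reclaim` are hypothetical names, email matching is assumed to be case-insensitive, and because the text does not state a verification rule for GHES, the sketch assumes none.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class SourceUser:
    login: str
    source_system: str      # "GHEC", "GHEC-EMU", or "GHES"
    email: Optional[str]    # email address on the source account, if any
    email_verified: bool


def can_auto_reclaim(user: SourceUser, onboarded_emails: Set[str]) -> bool:
    """Check whether a mannequin for this source user could be reclaimed automatically."""
    if not user.email:
        return False
    email = user.email.strip().lower()
    # Requirement 1: the source account has a va.gov email address.
    if not email.endswith("@va.gov"):
        return False
    # The address must be verified when migrating from GHEC; verification is
    # not necessary for GHEC-EMU. GHES is treated like GHEC-EMU here (assumption).
    if user.source_system == "GHEC" and not user.email_verified:
        return False
    # Requirement 2: a user with the same address is onboarded to GHEC-US.
    return email in {address.lower() for address in onboarded_emails}


onboarded = {"jane.doe@va.gov"}
print(can_auto_reclaim(SourceUser("jdoe", "GHEC", "jane.doe@va.gov", True), onboarded))
```

A user who fails this check would remain a mannequin until the missing requirement is met, for example by verifying the va.gov address or completing onboarding.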

See the Reclaiming mannequins for GitHub Enterprise Importer documentation for more information on mannequins and the reclaim process.