Recreating a Power BI Git Repo in Azure DevOps

In a previous blog, I showed how to shrink a Git repo using git commands and the open-source BFG Repo-Cleaner to remove the history of large PBIX files.
https://prodata.ie/2021/10/23/shrinking-azure-repos-bloat-due-to-powerbi/

Sometimes the patient is just too sick and it's too messy to clean up, so we nuke from orbit: create a fresh repo and get developers to re-clone. Sure, you lose history and comments, but you can archive those in another repo.

Anyone have a better way?

Steps to delete and recreate the repo

1. Make sure there is no work in progress

Get all BI developers to check in their work, and make sure there are no active branches. Ideally everything is in main, but you can also do this with, say, a dev branch by copying both branches.

Also, make sure your local copy is synced with the latest version.
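
If you want to script that check rather than eyeball it, here is a minimal Python sketch. It is not part of the original clean-up, it assumes git is on the PATH, and the helper names (dirty_paths, remote_branches, run_git) are made up for illustration.

```python
import subprocess


def dirty_paths(porcelain_output: str) -> list[str]:
    """Return the paths listed in `git status --porcelain` output."""
    # Each porcelain (v1) line is two status characters, a space, then the path.
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def remote_branches(branch_output: str) -> list[str]:
    """Return branch names from `git branch -r` output."""
    names = []
    for line in branch_output.splitlines():
        name = line.strip()
        # Skip the symbolic "origin/HEAD -> origin/main" pointer line.
        if name and "->" not in name:
            names.append(name)
    return names


def run_git(repo_dir: str, *args: str) -> str:
    """Run a git command inside repo_dir and return its stdout (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

In a local clone, `dirty_paths(run_git(".", "status", "--porcelain"))` should come back empty before you start, and `remote_branches(run_git(".", "branch", "-r"))` after a `git fetch --prune` tells you which branches still exist on the server.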

2. Rename the old repo in Azure DevOps

From the Repos menu, select the rename option and rename the repo to PowerBI_YYYYMMDD. This acts as a backup that keeps the check-in comments, etc.

Rename your local repo folder to match.

3. Create a new repo

Create a new empty repo with the same name as the old one and copy in the old contents (latest version), but no more than 5GB at a time.

Sync this back up to the cloud repo and then everyone can use the new repo. They may want to just re-clone it if the old one was, say, 50GB and the new one only 5GB, to save time on a 20 hour local git compress.
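
The copy itself can be sketched in Python as below. This is an illustration under assumptions, not the exact procedure used here: the function name copy_latest_content is hypothetical, and the one thing it is careful about is skipping the old .git folder so the new clone keeps its own empty history.

```python
import shutil
from pathlib import Path


def copy_latest_content(old_dir: str, new_dir: str) -> int:
    """Copy working files from old_dir into new_dir, leaving out the old .git folder.

    Returns the number of files in new_dir that sit outside any .git folder.
    """
    shutil.copytree(
        old_dir,
        new_dir,
        ignore=shutil.ignore_patterns(".git"),  # do not carry the bloated history over
        dirs_exist_ok=True,  # new_dir already exists because it is a fresh clone
    )
    return sum(
        1
        for p in Path(new_dir).rglob("*")
        if p.is_file() and ".git" not in p.relative_to(new_dir).parts
    )
```

For example, `copy_latest_content("PowerBI_YYYYMMDD", "PowerBI")` would fill the fresh clone from the renamed local folder; both folder names are placeholders.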

How long does it take?

For 5GB of Power BI files and a 60GB bloated repo it took under 30 mins.

  • 1 minute to rename the old repo and create/clone the new one
  • 1 minute to copy the 5GB of content locally into the new repo
  • 8 minutes to do the first check-in. Lots of local disk activity in the .git folder, which holds the object history.
  • 10 minutes to do the git delta compression and push up to the new repo, writing at about 200-500 Mbit/s to the cloud repo.

This resulted in a final clone size of 11.3GB with content of 5.7GB. For files that barely compress, like PBIX, expect your git clone to be roughly double the size of the content, as it keeps the object history plus the working copy on disk.

All the developers then had to re-clone the repo (15 mins).

Gotchas and Errors

One stumbling block in recreating a repo this way is that Azure limits a single push to 5GB, so you may have to add the files in smaller batches to make sure no one check-in is > 5GB.

Error message: This push was rejected because its size is greater than the 5120 MB limit for pushes in this repository. Learn more at https://aka.ms/gitlimit
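
One way to plan those smaller batches is a simple greedy grouping by file size. The Python sketch below is an assumption-laden illustration, not the method used above: collect_sizes and plan_batches are hypothetical names, the 4 GiB default is an arbitrary headroom figure under the 5120 MB limit, and raw file size is only a proxy for what git actually pushes.

```python
import os

# Headroom below the 5120 MB push limit quoted in the error message (assumed value).
DEFAULT_BATCH_LIMIT = 4 * 1024**3


def collect_sizes(root: str) -> dict[str, int]:
    """Map each file under root (relative path) to its size in bytes, skipping .git."""
    sizes = {}
    for folder, dirs, files in os.walk(root):
        dirs[:] = [d for d in dirs if d != ".git"]  # prune .git from the walk
        for name in files:
            full = os.path.join(folder, name)
            sizes[os.path.relpath(full, root)] = os.path.getsize(full)
    return sizes


def plan_batches(file_sizes: dict[str, int], limit: int = DEFAULT_BATCH_LIMIT) -> list[list[str]]:
    """Group files in path order so each batch totals at most limit bytes."""
    batches, current, current_size = [], [], 0
    for path, size in sorted(file_sizes.items()):
        if size > limit:
            raise ValueError(f"{path} is larger than the batch limit on its own")
        # Start a new batch when adding this file would exceed the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each batch from `plan_batches(collect_sizes("PowerBI"))` would then get its own `git add`, `git commit` and `git push`, so that no single push goes over the limit. The folder name is a placeholder.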

