How to remove a file completely from your repo history without dying in the process

One of the things I am grateful for is the existence of tools like Git or Subversion, which solve many serious conflicts that come up when developing (programming) as a team. Even if you are developing solo (which is my case), it is best to follow “best practices” and use Git to keep a precise record of, and control over, the various versions of your applications.

Today, I would like to share with you something that gave me a headache for three hours. Here’s my experience:

I was developing an application in Laravel. For some reason I made a backup by compressing my files, which generated an Archivar.zip file. Without really noticing, when I finished the changes I was making to my code I made a commit, and the file became part of it. When I noticed, I added the required lines to the “.gitignore” file to ignore ZIP files and made a new commit. Up to this point everything was peachy: I hadn’t yet published (git push) my commits to my remote repository, so the problem hadn’t surfaced. I kept on working, changed some more code and committed again. By then I had made 3 more commits since creating my ZIP file.
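Looking back, a small throwaway experiment shows why the new “.gitignore” entry could not help here. This is only a sketch under assumptions: the scratch directory, the identity values and the tiny stand-in file are made up for illustration; only the file name comes from my story.

```shell
# Throwaway repository in a temporary directory (hypothetical setup).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

# 1. The backup is committed by accident (a tiny stand-in for the real ZIP).
echo "pretend this is a big archive" > Archivar.zip
git add Archivar.zip
git commit -q -m "Changes plus accidental backup"

# 2. ZIP files are ignored from now on, in a new commit.
echo "*.zip" > .gitignore
git add .gitignore
git commit -q -m "Ignore ZIP files"

# .gitignore only affects untracked files: the ZIP is still tracked,
# and the first commit still carries its full content.
tracked=$(git ls-files Archivar.zip)
commits_with_zip=$(git log --oneline -- Archivar.zip | wc -l | tr -d ' ')
echo "still tracked: $tracked"
echo "commits touching Archivar.zip: $commits_with_zip"
```

Even deleting the file in a later commit would leave the old commit, and the blob it points to, in the history that git push has to upload.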

The problem began when I tried publishing my commits to my remote repository; that’s when GitHub told me:

“File Archivar.zip is 187.18 MB; this exceeds GitHub's file size limit of 100.00 MB”

I had never come across such a message before; I was stunned and wondered “What now?”.
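For anyone facing the same message, one way to see which files in the history are over the limit is to list every blob with its size. The function below is a hedged sketch: the name large_blobs is my own, and the default threshold assumes that the “100.00 MB” in the message means 100 * 1024 * 1024 bytes.

```shell
# List blobs anywhere in the history that are larger than a byte threshold.
# Run it from inside the repository. Default threshold: 100 MiB (assumption).
large_blobs() {
    limit=${1:-104857600}
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        awk -v limit="$limit" '
            $1 == "blob" && $3 + 0 > limit + 0 {
                size = $3
                # Drop the first three fields so paths with spaces stay intact.
                sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")
                print size, $0
            }'
}
```

Calling large_blobs with no argument prints one line per oversized file, with the size in bytes followed by the path, even when the file no longer exists in the working directory.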

I began researching and installed Git LFS to manage large files, but with no success; my problem lay in the fact that I was unable to sync my local repository with the remote one.

I tried various ideas, such as moving HEAD back to the commit made before the ZIP file was added so that it would be ignored in the following commit, but the more I tried, the more complicated the situation became.

During my research, I found a tool that finally solved the problem in less than 5 minutes (it took longer to download the Java SDK, no kidding). I’m sharing the link with you in hopes that it can be useful, should you ever find yourself in a similar situation.

In a nutshell, what this marvel (the BFG Repo-Cleaner) does is eliminate every record of a file from your Git repository, whether it is a large file, a file with sensitive information, etc.

This tool is a faster alternative to the git-filter-branch command.
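For reference, the kind of invocation involved looks roughly like the sketch below. The --delete-files option and the reflog/gc cleanup come from the BFG documentation; the function name purge_file_with_bfg, its arguments and the jar location are placeholders of mine, so adapt them to wherever you saved the download. Make a copy of your repository before trying it, since it rewrites history.

```shell
# Sketch: remove every file with a given name from a repository's history.
# Arguments (all hypothetical): path to the BFG jar, file name, repository dir.
purge_file_with_bfg() {
    bfg_jar=$1
    file_name=$2
    repo_dir=$3

    [ -f "$bfg_jar" ] || { echo "BFG jar not found: $bfg_jar" >&2; return 1; }

    # Rewrite the history, dropping files whose name matches.
    # BFG protects the latest commit by default, so the file should
    # already be removed from HEAD before running this.
    java -jar "$bfg_jar" --delete-files "$file_name" "$repo_dir" || return 1

    # Expire old references and physically discard the unreferenced objects.
    (cd "$repo_dir" &&
        git reflog expire --expire=now --all &&
        git gc --prune=now --aggressive)
}
```

In a case like mine the call would be something like purge_file_with_bfg ./bfg.jar Archivar.zip . followed by the usual git push. If the ZIP is still tracked in the latest commit, removing it there first (for example with git rm --cached Archivar.zip and a new commit) is probably needed, since BFG leaves that commit untouched.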

Once I had applied the tool, I was able to sync my repository without any problem. Well, here’s that link; I hope it will be as useful to you as it was to me.

https://rtyley.github.io/bfg-repo-cleaner/

Cross-Post: http://amiagencia.com/blog/technology/life-or-death-git/