It's Okay to Write Crap Code
Not every program needs unit tests and great documentation. Learn when you can and cannot get away with this.
I know. What the heck. For those who don't know, a few months ago I wrote the article "Stop Writing Crap Code". It is one of the best articles I have written. Today, I am going to make a small revision: it is okay to write crap code, occasionally.
This past week, many different websites removed their subscription requirement and opened their content to the public because of the coronavirus quarantine. Being the data junkie that I am, I saw this opportunity and decided to run with it. The only problem? The clock was ticking. I needed to scrape all the content I could in the least amount of time possible.
When to Write Crap Code
First, not every application you write will need to last forever. When you are first learning to code, you should just focus on solving problems, and lots of them. Over time you will start to understand why and where you need to use good coding practices.
Second, sometimes you just need your program to run once to save you from lots of manual labor. The example I will be showing today is from a program I wrote to scrape data off a magazine archive. The archive has 418 issues with over 38,014 pages. I wouldn't want to archive that manually. This is where you can safely get away with writing crap code.
What Crap Code Looks Like
So what makes my code crappy? Let’s take a good look!
1. Duplication
Crap code contains lots of code duplication. Take my save_image and save_metadata functions. I have the same code to create a directory in both places. The funny thing is that the save_image function should never have included the directory creation code. The save_metadata function should be the only place I create a directory, because I have to have metadata before I save images.
Fixing this would be simple because both functions use exactly the same code. More often, though, you will need to make sure the shared function is flexible enough to handle the different use cases.
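Here is a minimal sketch of what that cleanup could look like. It is not the code from my scraper: the ensure_dir helper, the function signatures, and the metadata.json file name are all assumptions made up for illustration. The only idea taken from the text above is that save_metadata owns directory creation and save_image does not.

```python
import json
from pathlib import Path


def ensure_dir(path):
    """Hypothetical helper: create the directory (and parents) if missing."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def save_metadata(issue_dir, metadata):
    # The only place a directory gets created, since metadata is saved first.
    target = ensure_dir(issue_dir) / "metadata.json"  # file name is an assumption
    target.write_text(json.dumps(metadata, indent=2))
    return target


def save_image(issue_dir, filename, data):
    # No directory creation here: save_metadata has already run for this issue.
    target = Path(issue_dir) / filename
    target.write_bytes(data)
    return target
```

If a second caller ever did need to create a directory, it would call ensure_dir instead of copying the logic again.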
2. Functions Too Long and Do Too Much
*Please do not read all this code. Just scroll to my comments on it.
I'm sorry you had to see that. This is to make a point though. A function should never be so long that you need to scroll to see all of it. This just makes it impossible to debug. Along with the length, there is too much going on in this function. Here is a list of everything that happens in this function:
- Iterating Article Links
- Requesting Article Cover Page
- Parsing Metadata
- Parsing Image URLs Out
- Iterating Page Links in Article
- Requesting the Article Page
- Parsing Image URLs Out of Article Pages
This is a nightmare because if any of the above pieces were broken, where would I go to fix it? Somewhere in the main function ¯\_(ツ)_/¯
To fix the above code, I would probably first break it down into the seven functions listed above. Then, after looking back over it again, I would see that several of those functions have similar bits, and I would extract those into their own functions. Take, for example, requesting the article cover page and requesting the article page. They both use code to check whether the request returned successfully. That check should be extracted into its own function.
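As a rough sketch of that last extraction, the shared check could live in one small function that both requests go through. The names fetch_html and FetchError are hypothetical, and the session argument is assumed to behave like a requests.Session, meaning session.get(url) returns an object with status_code and text attributes. My actual scraper may check success differently.

```python
class FetchError(Exception):
    """Raised when a page request does not come back successfully."""


def fetch_html(session, url):
    """Request a URL and return its body, or raise if the request failed.

    Assumes `session` works like requests.Session: `.get(url)` returns an
    object with `.status_code` and `.text`.
    """
    response = session.get(url)
    if response.status_code != 200:
        # One place to look when a request fails, whatever page it was for.
        raise FetchError(f"{url} returned status {response.status_code}")
    return response.text
```

Both the cover page request and the article page request would then call fetch_html(session, url), so a broken request has exactly one place to be debugged.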
3. Poor State Management
Believe it or not, it didn't start like this. It started with a simple functional program… okay, so I started with a program at least. Partway through, I realized that in the event of a failure I wanted to be able to reload most of the state so that I could pick up where I left off. I was glad I did this, because I did have to restart a few times.
Again though, all this state is not managed well. If this were built better, I would break the state down into smaller chunks. There should be one object to manage the session state and another to manage the article/page state.
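One way that split might look is sketched below. Everything here is an assumption for illustration: the class names, the fields (cookies, article_index, page_index), and the choice of a JSON file for persistence are not taken from my scraper.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class SessionState:
    """What is needed to talk to the site again (hypothetical fields)."""
    cookies: dict = field(default_factory=dict)


@dataclass
class ArticleState:
    """Where the scrape left off (hypothetical fields)."""
    article_index: int = 0
    page_index: int = 0


def save_state(path, session_state, article_state):
    # Each chunk of state is stored under its own key in one JSON file.
    payload = {
        "session": asdict(session_state),
        "article": asdict(article_state),
    }
    Path(path).write_text(json.dumps(payload))


def load_state(path):
    # With no saved file, start from scratch with default state.
    state_file = Path(path)
    if not state_file.exists():
        return SessionState(), ArticleState()
    payload = json.loads(state_file.read_text())
    return SessionState(**payload["session"]), ArticleState(**payload["article"])
```

After a crash, load_state would hand back both pieces, and the scraper could resume from the saved article and page instead of starting over.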
Why This Program is Okay
I was able to write this program in a few afternoons. It was able to scrape 41 GB of data from Magazine X. Now that I am done, I will probably just put this in my finished projects folder and let it rot. The lessons I learned from this will not go to waste though. Making it taught me a lot about downloading images from a website and handling sessions.
Not every program you write has to be the best ever. It's okay to write disposable code. Just know that if you do have to maintain your code, it is worth the effort to get it right.
Would you like to take a crack at refactoring this ugly code? Fork my repository and give it a go!