Many of the reports I run are a mashup of data from multiple sources and may contain millions of data points. Combining these reports often locked up my computer for several minutes, making it impossible to do anything while they ran, so I needed to find any way I could to cut down the processing time.
The first step seems obvious: cut out any data I didn't need. It was also important that this happen early in the process, so that I wasn't loading, moving, or saving more data than necessary. Cutting out extra data may be possible with better SQL queries, but often my data was pre-run during batch processing or came from third-party vendors. As a result, I would typically use Excel to sort, filter, and remove anything I didn't need.
The second part of this process involved looking at where calculations were being done. It was important to make sure that I wasn't re-running calculations each time I added new data. This might mean using a macro to fill in formulas after all the data had been combined. It was also important to make sure I wasn't entering 100,000 lines of formulas when I only had 30,000 lines on the report.
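As a rough sketch of this idea, a macro like the one below could fill a formula down only as far as the data actually goes, and only after the sources have been combined. The sheet layout, column letters, and formula here are hypothetical, not taken from any specific report:

```
' Sketch (VBA): fill the formula in column D only down to the last
' populated data row, after all source data has been combined.
' Assumes column A is always filled for every data row.
Sub FillFormulasAfterCombine()
    Dim lastRow As Long
    ' Find the last populated row in column A
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    ' Enter the formula once, then fill it down only as far as the data
    Range("D2").Formula = "=INDEX(Sheet2!B:B, MATCH(A2, Sheet2!A:A, 0))"
    Range("D2:D" & lastRow).FillDown
End Sub
```

Because the fill happens once at the end, the formulas are calculated a single time instead of being recalculated on every paste of new data, and no formula rows are wasted below the real data.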
It is also important to look at the order in which formulas are entered. If one formula depends on the result of another, check the order in which they are entered into the spreadsheet to avoid unnecessary recalculations.
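A small hypothetical example of the ordering point: if a tax column depends on a subtotal column, enter the subtotal formulas first, so the dependent column is not forced to recalculate a second time once the subtotals arrive.

```
Column D (subtotal), entered first:   =B2*C2
Column E (tax), entered second:       =D2*0.08
```

Entering column E before column D would mean E calculates once against empty cells and again after D is filled in.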
Reports change over time as different managers request new data points or decide they don’t need some information after all. I found that many old reports contained steps and calculations which were no longer needed but were never removed.
The final step comes as I gain knowledge, which involves looking to see whether things can be done in a better way. I have often looked at reports I built six months ago and realized how many inefficient methods I used. In Excel, replacing VLOOKUP with INDEX/MATCH can be faster, and in some cases lookup formulas were used just to check whether a value appeared on another sheet, when COUNTIF would have been much faster.
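To illustrate the comparison, here are sketches of the three approaches; the sheet name, ranges, and column positions are made up for the example:

```
VLOOKUP (scans the whole block, returns column 3):
    =VLOOKUP(A2, Data!$A$2:$C$50000, 3, FALSE)

INDEX/MATCH equivalent (MATCH scans one column, INDEX returns another):
    =INDEX(Data!$C$2:$C$50000, MATCH(A2, Data!$A$2:$A$50000, 0))

Existence check only (no value needed back, so skip the lookup entirely):
    =IF(COUNTIF(Data!$A$2:$A$50000, A2) > 0, "Found", "Missing")
```

The COUNTIF version makes the intent explicit: when the question is only "does this value exist on the other sheet," there is no reason to pay for retrieving a value you never use.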