Automated financial batch processing using Excel

James Matchett
7 min read · May 30, 2019

Hi everyone, I thought I’d share a recent development success story from our live services support unit, which looks after a UK financial sector client: a fund management organisation.

This was during the first week of my second 6-month rotation at Kainos, and I was really thrilled to be assigned such a large client with intricate systems. The client handles ethical investments for a number of public bodies, and as such manages a lot of financial investment data through their internal application.

The client handles transactions between accounts by manually entering multiple transactions into collections known as “Batches”. When a batch is created, its ID is sent to a second banker, who enters the batch again manually. If the two entries match exactly, the batch is authorised and processed. If they don’t match, the batch is rejected and must be entered and checked again.

The client contacted us to see if we could help them with a problem they were facing. They had a large number of transactions stored in Excel spreadsheets which had to be entered into the system for processing. Doing this manually would consume a large amount of effort on top of their daily throughput. As a rough estimate, if it took 30 seconds to fully enter a batch item, 600 items would take approximately 5 hours (600 × 30 seconds = 18,000 seconds). And because every batch is entered twice for the dual-entry check, they were looking at two people each spending around 5 hours on highly tedious manual entry.

As a result, the client asked Kainos if we would be able to automate the extraction and input of the transaction data from the Excel files into the existing system to save a large amount of time and effort.

The service is written in Visual Basic, which I hadn’t had a chance to use before; however, I found it quick to pick up. Thankfully, I also had some previous experience with the Excel Interop, which lets us interact with spreadsheets programmatically to read and write their contents.

Using some sample data that the client had provided, I decided to make a very basic program that would simply take a spreadsheet in, read each row into memory, sort the individual transactions into batches, and then submit the batches to the system’s web service for processing.

The first attempt worked surprisingly well: using the Excel Interop, the program first determined the number of rows it needed to read. It then iterated row by row, reading each cell into a new intermediary transaction object, which had a member variable for each property of the transaction that we wanted to store.
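
For illustration, here’s a minimal sketch of that reading loop; the Transaction shape, the column positions, and all names are stand-ins I’ve chosen for this post rather than the client’s actual code:

```
Imports System.Collections.Generic
Imports Microsoft.Office.Interop.Excel

' Hypothetical intermediary object; the real one held a member
' variable per column we wanted to keep.
Public Class Transaction
    Public Property AccountFrom As String
    Public Property AccountTo As String
    Public Property Amount As Decimal
End Class

Module SpreadsheetReader
    Function ReadTransactions(filePath As String) As List(Of Transaction)
        Dim app As New Application()
        Dim workbook As Workbook = app.Workbooks.Open(filePath)
        Dim sheet As Worksheet = CType(workbook.Sheets(1), Worksheet)
        Dim transactions As New List(Of Transaction)

        Try
            ' UsedRange reports how many populated rows the sheet holds.
            Dim rowCount As Integer = sheet.UsedRange.Rows.Count

            ' Row 1 holds the headers, so the data starts at row 2.
            For row As Integer = 2 To rowCount
                Dim t As New Transaction With {
                    .AccountFrom = CStr(CType(sheet.Cells(row, 1), Range).Value2),
                    .AccountTo = CStr(CType(sheet.Cells(row, 2), Range).Value2),
                    .Amount = CDec(CType(sheet.Cells(row, 3), Range).Value2)
                }
                transactions.Add(t)
            Next
        Finally
            workbook.Close(SaveChanges:=False)
            app.Quit()
        End Try

        Return transactions
    End Function
End Module
```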

Once every transaction had been read into memory, we then needed to sort the transactions into batches of 10 or fewer. There was an additional rule that each batch could only be composed of transactions with the same fund code (the last letter or letters of the account number associated with the transaction).

I wrote a really simple sorting algorithm which iterated over each transaction that had been read into memory and checked whether an “ideal batch” existed: one with the same fund code and fewer than 10 items. If it did, the transaction was added to it; if it did not, the algorithm created a new batch, added the transaction to it, and held the batch open for future transactions with the same fund code until the batch was full.
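
In rough terms, that pass looked something like the sketch below; Batch, FundCode and the other names are assumptions made for illustration:

```
Imports System.Collections.Generic

' Illustrative sketch only: Batch and the FundCode property are
' assumed names based on the description above.
Public Class Batch
    Public Const MaxSize As Integer = 10
    Public Property FundCode As String
    Public Property Items As New List(Of Transaction)
End Class

Public Class Transaction
    Public Property FundCode As String
End Class

Module BatchSorter
    Function SortIntoBatches(transactions As List(Of Transaction)) As List(Of Batch)
        Dim batches As New List(Of Batch)

        For Each t As Transaction In transactions
            ' Look for an "ideal batch": same fund code, still below capacity.
            Dim ideal As Batch = batches.Find(
                Function(b) b.FundCode = t.FundCode AndAlso
                            b.Items.Count < Batch.MaxSize)

            ' No open batch for this fund code, so create one and
            ' hold it open for future transactions.
            If ideal Is Nothing Then
                ideal = New Batch With {.FundCode = t.FundCode}
                batches.Add(ideal)
            End If

            ideal.Items.Add(t)
        Next

        Return batches
    End Function
End Module
```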

Finally, after all the reading and sorting, each batch was finalised. A “batch header” containing summary information about the batch was created, and the completed batch was sent to the system’s web service for further validation. If all was well and no errors were found, the batch would be processed and each transaction within it executed!
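
As a very rough sketch of that finalisation step, reusing the Batch shape from earlier; the header fields and the web-service call here are placeholders, since the real service contract isn’t something I can share:

```
' Illustrative only: the header fields and the submission call are
' placeholders, not the client's real web service contract.
Public Class BatchHeader
    Public Property FundCode As String
    Public Property ItemCount As Integer
End Class

Module BatchFinaliser
    Sub FinaliseAndSubmit(b As Batch)
        ' Summarise the batch before it is sent for validation.
        Dim header As New BatchHeader With {
            .FundCode = b.FundCode,
            .ItemCount = b.Items.Count
        }

        SubmitToWebService(header, b)
    End Sub

    Sub SubmitToWebService(header As BatchHeader, b As Batch)
        ' Hypothetical stub for the system's web service, which
        ' validates the batch and executes each transaction.
    End Sub
End Module
```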

I couldn’t even begin to imagine the time and effort that would go into completing this process manually. The estimate above of around 5 hours per person would be greatly extended by any errors in manual input that would have to be handled and accounted for.

I felt this first attempt worked really well, with no obstructive issues, and delivered on all of the requirements set out by the client. However, I really wanted to make this an efficient and maintainable solution that would be as easy as possible for the client to use, so I set about making some changes to the overall architecture of the solution.

First of all, I had issues with how the data was extracted from the spreadsheet and how it was held in memory.

The first-pass solution meant that the client would have to supply the columns in exactly the same order each time they used the system. The “intermediary object” had hard-defined member variables, which meant that if any additional fields had to be supported in future, changes would be needed in multiple places in the code. Additionally, not every column was populated in a given transaction; in fact, in most transactions a large number of columns were left blank, meaning that each intermediary transaction object wasted memory holding empty values.

I decided to knock out all of these issues in a single change: instead of using hard-defined member variables such as “Transaction.AccountTo”, each transaction object now held a hash map, storing the column name as the key and the cell’s contents as the value.

This simple change meant that the columns could appear in any order in the spreadsheet, and any column could be omitted or included as needed, greatly increasing usability for the client. It also greatly optimised our memory usage by not wasting memory on empty values. We could also define our own behaviours by writing a set of “getOrElse” methods, which attempted to look up a specified column header in a transaction and returned the value if it existed in the spreadsheet row; otherwise they returned a default value, or threw an error if the value was required.
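
A sketch of what that dictionary-backed transaction might look like; the method names (SetField, GetOrElse, GetOrThrow) are illustrative rather than the production API:

```
Imports System.Collections.Generic

' A sketch of the dictionary-backed transaction described above.
Public Class Transaction
    ' Only columns actually present in the row are stored,
    ' so blank cells cost no memory.
    Private ReadOnly _fields As New Dictionary(Of String, String)

    Public Sub SetField(columnName As String, value As String)
        If Not String.IsNullOrEmpty(value) Then
            _fields(columnName) = value
        End If
    End Sub

    ' Return the stored value, or a default if the column was omitted.
    Public Function GetOrElse(columnName As String, defaultValue As String) As String
        Dim value As String = Nothing
        If _fields.TryGetValue(columnName, value) Then
            Return value
        End If
        Return defaultValue
    End Function

    ' Variant for mandatory columns: a missing value is an error.
    Public Function GetOrThrow(columnName As String) As String
        Dim value As String = Nothing
        If _fields.TryGetValue(columnName, value) Then
            Return value
        End If
        Throw New KeyNotFoundException(
            "Required column '" & columnName & "' is missing")
    End Function
End Class
```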

This meant that in the final stage of transforming an intermediary object into a batch item, we simply queried each transaction’s hash map for a key (a column header from before) and used the value if it existed, or the preferred default if it didn’t.
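
Continuing the sketch above, the transformation step then reads something like this, with BatchItem, the column headers and the defaults as stand-in names:

```
' Illustrative only: BatchItem and the column headers are assumptions,
' not the client's real schema. Reuses the dictionary-backed
' Transaction sketch above.
Public Class BatchItem
    Public Property AccountTo As String
    Public Property Reference As String
End Class

Module BatchItemMapper
    Function ToBatchItem(t As Transaction) As BatchItem
        Dim item As New BatchItem()
        ' Mandatory column: throws if the spreadsheet omitted it.
        item.AccountTo = t.GetOrThrow("Account To")
        ' Optional column: falls back to an empty reference.
        item.Reference = t.GetOrElse("Reference", "")
        Return item
    End Function
End Module
```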

As a result of these changes, if further fields have to be supported in future, a code change only has to be made in one location instead of several. I see this as an absolute win for future maintainability.

Secondly, during the process of sorting transactions into batches, I found that the method used to find an ideal batch for a transaction took longer and longer as the application ran. Once a batch reaches 10 items it can no longer accept any more transactions, yet every one of these “complete” batches was still being scanned before an ideal batch (at the end of the list) was found, or, worse, every batch was scanned before the program discovered its only option was to create a new batch.

The solution was surprisingly simple: keep a “growing” list and a “complete” list of batches. While a batch is still gaining items it resides in the growing list; once it reaches 10 items, it is moved to the complete list. When we try to find an ideal batch for a transaction, we now only iterate over the growing list, which keeps the search fast for the entire runtime of the program.
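
A minimal sketch of that two-list approach, again with illustrative names:

```
Imports System.Collections.Generic
Imports System.Linq

' Sketch of the two-list optimisation, reusing the Batch and
' Transaction shapes from the earlier batching sketch.
Public Class Transaction
    Public Property FundCode As String
End Class

Public Class Batch
    Public Const MaxSize As Integer = 10
    Public Property FundCode As String
    Public Property Items As New List(Of Transaction)
End Class

Public Class Batcher
    Private ReadOnly _growing As New List(Of Batch)    ' still accepting items
    Private ReadOnly _complete As New List(Of Batch)   ' full; never scanned again

    Public Sub Add(t As Transaction)
        ' Only open batches are searched, so full batches cost nothing.
        Dim ideal As Batch = _growing.Find(
            Function(b) b.FundCode = t.FundCode)

        If ideal Is Nothing Then
            ideal = New Batch With {.FundCode = t.FundCode}
            _growing.Add(ideal)
        End If

        ideal.Items.Add(t)

        ' Once full, retire the batch so later searches skip it.
        If ideal.Items.Count = Batch.MaxSize Then
            _growing.Remove(ideal)
            _complete.Add(ideal)
        End If
    End Sub

    Public Function AllBatches() As List(Of Batch)
        Return _complete.Concat(_growing).ToList()
    End Function
End Class
```

Because a full batch is retired immediately, everything left in the growing list is guaranteed to have spare capacity, so the search only needs to match on fund code.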

I’m really pleased with the result: not only did the final tool process over 600 transactions in 15 minutes, but the client also said they would love to use it in other areas in future because of its ease of use and seamless integration with their current system. That is everything I wanted to achieve from the beginning, and more.

The tool was originally required to process just two transaction types, but because of its ease of use and the adaptability built into the spreadsheet reading, the client can use it for all 11 unitised transaction types. This gives the client much more scope to process large volumes of transactions without the overhead of substantial manual effort; going from 15 hours of manual effort to 15 minutes of automated processing is a real achievement!

Thank you very much for reading, I really hope to contribute to more projects in future which directly impact and improve the client’s experience with Kainos.
