The Future of Finance is Written Here.

ChatGPT for Accounting: How Digits is using Generative Machine Learning to transform finance

ChatGPT, one of the largest and most sophisticated language models ever created, has recently become a household topic of conversation. If you've been wondering how this incredible technology can be applied to the accounting and finance space, this is the article for you :)

Welcome to chapter two of our three-part series on machine learning! We kicked off with an introduction to similarity-based machine learning, and how we apply it to accounting use cases at Digits. Today, let's explore generative machine learning and what it can bring to the accounting world.

Generative machine learning has received significant attention because it opens up a completely new field of “AI.” It is getting closer to fulfilling the human dream of teaching machines some form of “creativity.” Models like ChatGPT, DALL-E, and T5 have provided solutions to various problems, including writing text, generating photo-realistic images, and summarizing complex topics. In this blog post, we are excited to explore machine learning for natural language generation and how we are using these concepts today at Digits.

What is Generative Machine Learning?

Traditionally, machine learning has been applied to classification problems, where you take some text and distill it into different buckets or categories. You can think of the text as being "encoded" into those categories. For years, the dream has been to push beyond that and train a machine learning model that can actually generate text, rather than just classify it. How might that work?

Researchers began building on this approach by experimenting with model architectures that first reduce information through a model encoder and then “decompress” the information back into human-readable text through a decoder. They made a significant breakthrough in 2017 when they presented an encoder-decoder model architecture called Transformer.

Figure: The encoder-decoder model architecture called Transformer

The figure above shows the encoder (left side) and decoder (right side) structure. Over the last few years, researchers have further refined this architecture by increasing the number of model weights, which lets the model capture more “knowledge,” and by fine-tuning the decoder to respond to “instructions.” The fact that models can now take “instructions” as inputs unlocked meta-learning, where a model can generate text for scenarios it was not explicitly trained on. For example, we can train a model to translate between English and German and between English and French, and through “instructions,” the model can then be prompted to translate between German and French.

To generate text for a given input text, the decoder takes the embedding it obtains from the encoder (the reduced information) together with the initial instruction and generates the first token of the output text. It then uses the newly generated token, together with the instruction and the embedding, to generate the second token. This generation loop continues until the decoder reaches its maximum sequence length (usually 512 or 1024 tokens) or produces a stop token, which signals that any text generated afterward is considered padding. The generated text then reflects the model’s response to the input text and the given instruction. Here is an example:
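As a rough illustration of the generation loop described above, the following Python sketch uses toy stand-ins for the encoder and decoder. The names `toy_encoder`, `toy_decoder`, `generate`, and the `"uppercase"` instruction are hypothetical and are not part of any Digits system or real model; an actual encoder returns dense vectors and an actual decoder is a neural network.

```python
from typing import Callable, List

STOP = "<stop>"  # hypothetical stop token


def toy_encoder(text: str) -> List[str]:
    # Stand-in for a real encoder: real models return dense vectors, not words.
    return text.split()


def toy_decoder(embedding: List[str], instruction: str, generated: List[str]) -> str:
    # Stand-in for a real decoder: emits one upper-cased token per step,
    # then a stop token once the input is exhausted.
    position = len(generated)
    if instruction != "uppercase" or position >= len(embedding):
        return STOP
    return embedding[position].upper()


def generate(
    decoder: Callable[[List[str], str, List[str]], str],
    embedding: List[str],
    instruction: str,
    max_len: int = 512,
) -> List[str]:
    """Feed each newly generated token back in until a stop token or max_len."""
    generated: List[str] = []
    while len(generated) < max_len:
        token = decoder(embedding, instruction, generated)
        if token == STOP:
            break
        generated.append(token)
    return generated


output = generate(toy_decoder, toy_encoder("coffee shop payment"), "uppercase")
print(output)
```

The two exit conditions in `generate` mirror the ones in the text: the loop ends either when the maximum sequence length is reached or when the decoder emits the stop token.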

Assisting Accountants with Similarity-based Machine Learning

Just last year, we released Boost to help accountants save time by automating their work. Boost instantly spots inconsistencies in their clients' ledgers, saving time and embarrassment! Every second, Digits sifts through every single transaction and performs a deep analysis. Boost alerts accountants if it finds errors like transactions in unexpected categories and suggests categories for transactions with missing categories.

The simplicity of the product is thanks to the powerful technology we built to make this possible.

With this three-part series, Digits’ Machine Learning team provides a look behind the scenes at how it works.

In this first blog post, we will explain why machine learning is crucial for accounting and how we detect categories for banking transactions with similarity-based machine learning models. In parts two and three, we will dive into how we use machine learning to accelerate the interactions between accountants and their clients.

Why Machine Learning?

Machine learning is a versatile tool for many applications, including accounting. For example, if we want to categorize transactions correctly, we can look at similar transactions and mimic their existing categorizations. We could find highly similar transactions through traditional statistical methods, such as computing the Levenshtein distance between the transaction descriptions, but those methods fail in the following scenarios:

Figure: Finding similar transactions with Machine Learning

Because of the number of failure cases of traditional statistical methods, we decided to develop a custom machine learning-based solution.
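For readers unfamiliar with the Levenshtein distance mentioned above, here is a minimal Python sketch of the standard dynamic-programming algorithm. It illustrates the baseline technique only, not Digits' implementation, and the transaction descriptions in the usage lines are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute (or match)
                )
            )
        previous = current
    return previous[-1]


# Hypothetical transaction descriptions, for illustration only.
print(levenshtein("ACME COFFEE #12", "ACME COFFEE #47"))
print(levenshtein("ACME COFFEE #12", "AC*COFFEE ONLINE"))
```

Because the distance only counts character edits, two descriptions that refer to the same vendor can still end up far apart, and two unrelated descriptions can end up close.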

Architecting Digits Search: Real-time Transaction Indexing With Bleve

We recently launched Digits Search, which brings fast, beautiful, full-depth search to business financial data. We’ve received a ton of great customer feedback, but one question just keeps coming up…

How did you build this!?

Well, let’s break down how Digits Search puts your finances at your fingertips.

Digits Architecture

First, some quick background on Digits:

Digits links with your business’ accounting software and your financial institutions to build, and continually maintain, a living model of your business with the most up-to-date data. Once linked, we ingest all of that financial data in raw form, a collection that we call “facts”.

We then use machine learning and data processing to normalize all of that overlapping, unstructured data. We perform a significant number of calculations to fill holes in the picture, such as predicting how your latest transactions will be categorized, detecting and predicting recurring activity, etc.

The end result of all this work is a “view”, which is then efficiently loaded into Google Cloud Spanner as well as encrypted and archived in Google Cloud Storage for secondary processing.

Each view we produce is a complete, standalone picture of your company’s entire financial history.

Views are served by our serving layer, which is composed of a number of services communicating over TLS-encrypted gRPC APIs, written in Go and hosted in GKE. Our serving layer aims to optimize for efficiency, security, and reliability.

Architecting Search

Training and Deploying State of the Art Transformer Models at Digits

Understanding banking transactions as they happen, in real-time, is core to our mission with Digits Search. You can’t answer important finance questions with bad data.

Transaction descriptions contain valuable information that helps us understand and communicate our customers’ business activity. The information we extract is then indexed, made available via Digits Search, and presented in a far more human-readable and intuitive manner than customers would get from reviewing their raw bank or credit card statements.

Here we wanted to share a peek behind the curtains on how we extract transaction information with Natural Language Processing (NLP) at Digits. You’ll learn how we apply state-of-the-art Transformer models to this problem and how we go from an ML model idea all the way to a production integration with our Digits Search product.

Our Plan

Information can be extracted from unstructured text through a process called Named Entity Recognition (NER). This NLP concept has been around for many years, and its goal is to classify tokens into predefined categories, such as dates, persons, locations, and entities.

For example, the transaction below could be transformed into the following structured format:

Figure: Named Entity Recognition in action
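As a rough illustration of what token classification produces, the Python sketch below merges per-token tags into structured entities using the common BIO tagging convention (B = beginning of an entity, I = inside, O = outside). The tokens, the tags, and the label names `ORG`, `LOC`, and `DATE` are hypothetical; in a real system a trained model would predict the tags, and nothing here reflects Digits' actual label set.

```python
from typing import List, Optional, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (label, text) spans."""
    entities: List[Tuple[str, str]] = []
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Store the entity collected so far, if any.
        if current_tokens:
            entities.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_new:
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            current_tokens.append(token)
        else:  # "O": token belongs to no entity
            flush()
            current_label, current_tokens = None, []
    flush()
    return entities


# Hypothetical transaction description and hand-written tags.
tokens = ["PURCHASE", "ACME", "COFFEE", "SAN", "FRANCISCO", "05/12"]
tags = ["O", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-DATE"]
entities = group_entities(tokens, tags)
print(entities)
```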

We had seen outstanding results from NER implementations applied to other industries and we were eager to implement our own banking-related NER model. Rather than adopting a pre-trained NER model, we envisioned a model built with a minimal number of dependencies. That avenue would allow us to continuously update the model while remaining in control of “all moving parts.” With this in mind, we ruled out available tools like the spaCy NER implementation or Hugging Face models for NER. We ended up building our internal NER model based only on TensorFlow 2.x and the ecosystem library TensorFlow Text.

Building Digits: Efficient Serving of Static Views with Google Cloud Spanner

At Digits, we strive to push the bounds of technology in order to deliver radically more useful, delightful software experiences for our customers. We’re excited to begin sharing a closer look at the technical foundations that underpin our products in a new series of blog posts called Building Digits. Without further ado…

Let’s talk about viewing complex data. One of our primary goals at Digits is to provide business owners with insightful and holistic views of their company’s finances, in near real time.

Achieving this involves three major, independent steps:

  1. We collect all of their relevant data from various sources, such as their QuickBooks, the financial institutions they bank with, their corporate credit card providers, etc.
  2. We apply our algorithms and proprietary datasets to extend, interpret, and tease out meaning from all of their data.
  3. We consolidate and aggregate the results into a holistic view that we then visualize for them on their dashboard.

Facts

We refer to the pieces of data that we receive from third-party systems as Facts. This is not a judgement of the credibility or immutability of these systems, but rather a delineation of what is (and what is not) under our control.

For example, if we receive a transaction from an external source that looks like 05/12/20 - Taxi $15.05, we might classify it as Transportation. Later, we may receive another piece of information that leads us to believe that this transaction was actually a client expense, and is better classified as Meals & Entertainment. In this example, the transaction itself, the fact, did not change—but our interpretation of it did.

Computed Data

We refer to insights and analysis that are performed by Digits, based on all of the Facts that we have received, as Computed Data. In the example above, this involved a category classification of a transaction. In other cases, this might involve determining that two external pieces of information actually represent the same physical transaction, or detecting that a particular transaction tends to recur on a regular interval and that it should be treated as a subscription.
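One way to picture the split between Facts and Computed Data is as two separate record types: an immutable record for what was received and a revisable record for the interpretation. The Python sketch below is only an illustration under that assumption; the class names, fields, and the `reclassify` method are hypothetical and do not describe Digits' actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Fact:
    """Raw data as received from a third-party system; never modified by us."""
    date: str
    description: str
    amount_cents: int


@dataclass
class ComputedData:
    """Our interpretation of a fact; may be revised as more information arrives."""
    fact: Fact
    category: str
    previous_categories: List[str] = field(default_factory=list)

    def reclassify(self, new_category: str) -> None:
        # Keep a trail of earlier interpretations; the fact itself is untouched.
        self.previous_categories.append(self.category)
        self.category = new_category


# The taxi example from the text above.
taxi = Fact(date="05/12/20", description="Taxi", amount_cents=1505)
interpretation = ComputedData(fact=taxi, category="Transportation")
interpretation.reclassify("Meals & Entertainment")
print(interpretation.category, interpretation.previous_categories)
```

Marking `Fact` as frozen makes the distinction explicit in code: reinterpreting a transaction changes only the `ComputedData` record.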
