
Open Books

IQ Bot

how to train a machine to recognize complex content

Overview - what is IQ Bot

In this information era, data is the key to almost every business. Traditional ways of processing information and recording data show increasingly clear shortcomings: they are slow, costly, error-prone, and limited to working hours. If we can train a machine to read content, fetch information, and record data in the desired format, we can free many pairs of eyes from reading claims, balancing accounts, confirming client information, and so on. A machine can run 24/7 at a much lower cost than human labor, and once properly trained, it keeps performing well with a very low error rate. The question is: how can we properly train a machine to process data, especially complex content?

MVP in 6 weeks

The first version of the product was developed in a rush. To keep up with a major competitor, we needed to deliver a fully functional product in 6 weeks, including UX/UI design, development, and debugging, which squeezed my first round of design to only about one and a half weeks. It is quite a complex product, and it took me almost 3 days just to understand the requirements. The first version was far from perfect or tuned, but I had to deliver the MVP in a short amount of time, so I focused on the key features and compromised on the UI details.

I broke down the whole process into three modules: layout design, blueprint (data structure) design, and result validation (testing). Then I solved the smaller issues module by module. Below is the overall idea of how the product logically works.

The first module is layout design, which identifies the location of each piece of data that we want to capture. I grouped all the navigation tools at the top of the interface and left the design space as large as possible.

 

Each layout element first needs to be mapped on the document, after which a user can modify its attributes. Once an element is mapped and defined, it is listed under one of three categories: markers (usually logos or special images), fields (price, date, address, etc.), and tables (which need to be broken down into columns). Below is the first version of the Layout Designer:

Layout Module
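
To make the three categories concrete, here is a minimal sketch of how mapped layout elements might be represented and grouped. It is only an illustration, not the product's actual code: the names `ElementKind`, `LayoutElement`, and `group_by_kind`, the bounding-box format, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class ElementKind(Enum):
    MARKER = "marker"  # usually logos or special images
    FIELD = "field"    # price, date, address, etc.
    TABLE = "table"    # broken down into columns


@dataclass
class LayoutElement:
    name: str
    kind: ElementKind
    # Hypothetical bounding box on the document: (x, y, width, height).
    box: Tuple[int, int, int, int]
    # Only tables carry column names; other elements leave this empty.
    columns: List[str] = field(default_factory=list)


def group_by_kind(elements: List[LayoutElement]) -> Dict[ElementKind, List[str]]:
    """Group mapped elements into the three category lists."""
    groups: Dict[ElementKind, List[str]] = {kind: [] for kind in ElementKind}
    for element in elements:
        groups[element.kind].append(element.name)
    return groups


# Made-up sample elements for an invoice layout.
elements = [
    LayoutElement("Company logo", ElementKind.MARKER, (20, 20, 120, 40)),
    LayoutElement("Invoice date", ElementKind.FIELD, (400, 60, 100, 20)),
    LayoutElement("Line items", ElementKind.TABLE, (20, 200, 560, 300),
                  columns=["Description", "Quantity", "Price"]),
]
groups = group_by_kind(elements)
```

Each element keeps both its location and its category, which mirrors the two steps above: map first, then define and list.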

After the layout is mapped out, we need to define the data structure of the sample. The reason is that, in order to load all captured data into the database, each piece of data needs to be defined properly to suit the database requirements. For example:

  • Each field needs a unique identification number, so that when we store thousands of values in the database, we can easily find the value of a specific field by its identification number.

  • Each field also needs a data type; for example, "Invoice date" would be of the "Date" type. Then, when the machine captures data, we can verify whether it is valid. Say one document has a date field "12-13-2018" and the machine reads it as "12132oL8": it will automatically reject the value and warn the user of low confidence, since it is not a valid date.
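
The idea of typed fields with unique IDs can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how the product is implemented: `BlueprintField`, `validate`, the ID `101`, and the `%m-%d-%Y` date format are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BlueprintField:
    field_id: int   # unique identification number (hypothetical value below)
    name: str
    data_type: str  # e.g. "Date" or "Text"


def validate(field, raw_value, date_format="%m-%d-%Y"):
    """Return (accepted, message) for a value captured by the machine."""
    if field.data_type == "Date":
        try:
            # strptime raises ValueError when the text is not a valid date.
            datetime.strptime(raw_value, date_format)
        except ValueError:
            return False, f"low confidence: {raw_value!r} is not a valid date"
    return True, "ok"


invoice_date = BlueprintField(field_id=101, name="Invoice date", data_type="Date")

good = validate(invoice_date, "12-13-2018")  # a correctly read date
bad = validate(invoice_date, "12132oL8")     # the misread value from the example

# Accepted values are stored under the field's ID for easy lookup later.
captured = {}
if good[0]:
    captured[invoice_date.field_id] = "12-13-2018"
```

Here the correctly read date is accepted and stored under its ID, while the misread value is rejected with a low-confidence warning instead of reaching the database.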

 

Below is the data structure designer, which we also called the "Blueprint":

Blueprint Module

After testing on several more sample documents, the bot can be put into production. For example, an invoice bot can read thousands of similar invoices, extract all the important information, and automatically store it in an accounting system. Below is a marketing demo that shows how IQBot liberates today's knowledge workers from tedious data fetching, so that they only handle exceptions, make the important decisions that only humans can make, and devote more time to higher-order business value.

What now?

At the latest industry conference, "IMAGINE London", our product team proudly demoed the latest design I delivered. Over the past two years, IQBot has become smarter and smarter, and users no longer need to design the layout manually. Instead, the engine does it for them, since it has already learned a lot and can automatically sort large numbers of documents into categories. Below are some of the typical screens we demoed at the conference. The sample document used in them is made up, since real customer documents can't be shown.

Create Learning Instance

Now, users can pick a preset domain template and the data they are interested in to start the training. We call this a learning instance.

Configure learning instance

After the initial selection, a much cleaner design interface lets users configure the attributes of each item.

Natural Language Processing

We are now also aiming to read more unstructured data, such as emails and letters. This will make our bot much more versatile.

Look back

Beyond the improvements above, we also have new directions for the future, for example reading more complex data such as signatures, handwriting, sets of checkboxes, and radio buttons. The more we train our bots and the more samples we gather, the faster the process will be. I believe that in the near future our system will not only grow exponentially in processing speed and accuracy, but will also handle much deeper layers of complexity and much more complicated situations.

 

What I learned from the journey of designing IQBot:

  • Design is first about logic, and then beauty.

This does not mean I need to be a programmer or PM who dives into the technical details. It means that, as a designer, I must understand the fundamental logic and purpose of a product for the design to make sense.

  • Nail down the MVPs, and details will come along.

I used to be a perfectionist who liked to think through every tiny detail of the user flow. That is good, but I also needed to learn about balance, trade-offs, and iteration. We sometimes compromise because of timing and other pressure on resources, but it is also true that only once you push a product into production and real use do you get real feedback, which lets you target the problems precisely and make it better. So don't be afraid of not being perfect.

  • Don't stop dreaming.

When I first started this project, machine learning was not as mature as it is today, but it has grown so fast in the past two years that back then I couldn't even imagine what we can do today. Two years ago, if someone had asked me how to improve the product, I would probably have said: make the manual mapping process much smoother and add more friendly tools. But today, we can skip the manual mapping step entirely and make it happen automatically! So really, I should let my imagination run wild!