Machine Learning: Making Sense of Unstructured Data and Automation in Alt Investments – Traders Magazine
The following was written by Harald Collet, CEO at Alkymi, and Hugues Chabanis, Product Portfolio Manager, Alternative Investments at SimCorp.
Institutional investors are buckling under the operational constraint of processing hundreds of data streams from unstructured data sources such as email, PDF documents, and spreadsheets. These data formats bury employees in low-value ‘copy-paste’ workflows and block firms from capturing valuable data. Here, we explore how Machine Learning (ML), paired with a better operational workflow, can enable firms to extract insights more quickly for informed decision-making, and help govern the value of data.
According to McKinsey, the average professional spends 28% of the workday reading and answering an average of 120 emails – on top of the 19% spent on searching and processing data. The issue is even more pronounced in information-intensive industries such as financial services, as valuable employees are also required to spend needless hours every day processing and synthesizing unstructured data. Transformational change, however, is finally on the horizon. Gartner research estimates that by 2022, one in five workers engaged in mostly non-routine tasks will rely on artificial intelligence (AI) to do their jobs. And embracing ML will be a necessity for digital transformation demanded both by the market and the changing expectations of the workforce.
For institutional investors that are operating in an environment of ongoing volatility, tighter competition, and economic uncertainty, using ML to transform operations and back-office processes offers a unique opportunity. In fact, institutional investors can capture efficiency gains of 15-30% in operations by applying ML and intelligent process automation (Boston Consulting Group, 2019), which in turn creates ‘operational alpha’ through improved customer service and agile processes redesigned front-to-back.
Operationalizing machine learning workflows
ML has finally reached the point of maturity where it can deliver on these promises. AI has been developing for decades, but the deep learning breakthroughs of the last decade have played a major role in the current AI boom. When it comes to understanding and processing unstructured data, deep learning solutions provide much higher levels of potential automation than traditional machine learning or rule-based solutions. Rapid advances in open source ML frameworks and tools – including natural language processing (NLP) and computer vision – have made ML solutions more widely available for data extraction.
Asset class deep-dive: Machine learning applied to Alternative investments
In a 2019 industry survey conducted by InvestOps, data collection (46%) and efficient processing of unstructured data (41%) were cited as the top two challenges European investment firms faced when supporting Alternatives.
This is no surprise, as Alternative assets present an acute data management challenge and are costly, difficult, and complex to manage, largely due to the unstructured nature of Alternatives data. This data is typically received by investment managers as emails with a variety of PDF documents or Excel templates that require significant operational effort and human understanding to interpret, capture, and utilize. For example, transaction data is typically received as a PDF document via email or an online portal. In order to make use of this mission-critical data, the investment firm has to manually retrieve, interpret, and process documents in a multi-level workflow involving 3-5 employees on average.
The exceptionally low straight-through-processing (STP) rates already suffered by investment managers working with Alternative investments are a problem that will deteriorate further as Alternatives become an increasingly important asset class, predicted by Preqin to rise to $14 trillion AUM by 2023 from $10 trillion today.
Specific challenges faced by investment managers dealing with manual Alternatives workflows are:
- The process is slow and expensive: manual data search and entry are time-consuming, unpredictable, and not scalable.
- There is a higher incidence of lost or unused data, because email is brittle and the copy-paste process is error-prone.
- The process is fragmented: business data is scattered across repositories, with limited visibility across the organization.
Within the Alternatives industry, various attempts have been made to use templates or standardize the exchange of data. However, these attempts have so far failed, or are progressing very slowly.
Applying ML to process the unstructured data will enable workflow automation and real-time insights for institutional investment managers today, without needing to wait for a wholesale industry adoption of a standardized document type like the ILPA template.
To date, the lack of straight-through-processing (STP) in Alternatives has either resulted in investment firms putting in significant operational effort to build out an internal data processing function, or reluctantly going down the path of adopting an outsourcing workaround.
However, applying a digital approach, more specifically ML, to workflows in the front, middle and back office can drive a number of improved outcomes for investment managers, including:
- Providing real-time access to high-value data insights to enable more informed investment decision-making, faster reporting and better client service,
- Driving a deeper understanding of portfolio risk and exposure by allowing investment managers to make use of more data,
- Enabling highly efficient end-to-end workflows that make data visible, accessible, and available the second it enters the enterprise,
- Eliminating the need for outsourcing, enabling firms to maintain exclusive control and security over the proprietary data that forms their vital edge.
Trust and control are essential when automating critical data processing workflows. They are achieved with a ‘human-in-the-loop’ design that puts the employee squarely in the driver’s seat, with features such as confidence scoring thresholds, randomized sampling of the output, and second-line verification of all STP data extractions. Validation rules on every data element can ensure that high-quality output data is generated and normalized to a specific data taxonomy, making data immediately available for action. In addition, processing documents with computer vision can allow all extracted data to be traced to the exact source location in the document (such as a footnote in a long quarterly report).
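As a rough illustration of how such a human-in-the-loop gate might be wired together, the Python sketch below routes each extracted field to straight-through processing, a random spot check, or manual review. It is a minimal sketch under stated assumptions, not a description of any vendor's product: the names ExtractedField, is_valid and route, the 0.95 confidence threshold, the 10% sampling rate, and the single validation rule are all hypothetical choices made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """One data element pulled from a document (hypothetical structure)."""
    name: str          # e.g. "call_amount"
    value: str         # raw extracted text
    confidence: float  # model score in [0, 1]
    page: int          # source location, kept so the value can be traced back


# Illustrative settings only; real values would be tuned per workflow.
CONFIDENCE_THRESHOLD = 0.95
SAMPLE_RATE = 0.10


def is_valid(field: ExtractedField) -> bool:
    """Hypothetical validation rule: amounts must parse as non-negative numbers,
    and every other field must be non-empty."""
    if field.name.endswith("_amount"):
        try:
            return float(field.value.replace(",", "")) >= 0
        except ValueError:
            return False
    return bool(field.value.strip())


def route(field: ExtractedField, rng) -> str:
    """Decide what happens to a field after extraction.

    `rng` is any object with a random() method returning a float in [0, 1),
    such as random.Random().
    """
    if not is_valid(field):
        return "review"            # failed a validation rule
    if field.confidence < CONFIDENCE_THRESHOLD:
        return "review"            # model is not confident enough
    if rng.random() < SAMPLE_RATE:
        return "sample_check"      # randomized second-line verification
    return "straight_through"      # accepted without manual touch


if __name__ == "__main__":
    rng = random.Random()
    field = ExtractedField("call_amount", "1,250,000.00", 0.99, page=3)
    print(field.name, "->", route(field, rng), "(source: page", field.page, ")")
```

Keeping the page number on each field is one simple way to preserve traceability to the source document, and passing the random generator in as a parameter makes the sampling behaviour easy to test.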
Reverse outsourcing to govern the value of your data
Big data is often considered ‘the new oil’ or a ‘super power’, and there are, of course, many third-party service providers standing at the ready, offering to help institutional investors extract and organize the ever-increasing amount of unstructured big data. This data is not easily accessible, either because of its format (emails, PDFs, etc.) or its location (web traffic, satellite images, etc.). To overcome this, some firms turn to outsourcing. While this removes the heavy manual burden of data processing for investment firms, it generates other challenges, including governance issues and a lack of control.
Embracing ML and unleashing its potential
Investment managers should think of ML as an in-house co-pilot that can help their employees in various ways. First, it is fast: documents are processed instantly and, when confidence levels are high, processed data requires only minimal review. Second, ML is used as an initial set of eyes, to initiate proper workflows based on documents that have been received. Third, instead of just collecting the minimum data required, ML can collect everything, providing users with options to further gather and reconcile data that may otherwise have been ignored and lost due to a lack of resources. Finally, ML will not forget the format of any historical document – from yesterday or 10 years ago – safeguarding institutional knowledge that is commonly lost during cyclical employee turnover.
ML has reached the maturity where it can be applied to automate narrow and well-defined cognitive tasks and can help transform how employees work in financial services. However, many early adopters have paid a price for focusing too much on the ML technology and not enough on the end-to-end business process and workflow.
The critical gap has been in planning for how to operationalize ML for specific workflows. ML solutions should be designed collaboratively with business owners and target narrow and well-defined use cases that can successfully be put into production.
Alternative assets are costly, difficult, and complex to manage, largely due to the unstructured nature of Alternatives data. Processing unstructured data with ML is a use case that generates high levels of STP through the automation of manual data extraction and data processing tasks in operations.
Using ML to automatically process unstructured data for institutional investors will generate ‘operational alpha’: a level of automation necessary to make data-driven decisions, reduce costs, and become more agile.
The views represented in this commentary are those of its author and do not reflect the opinion of Traders Magazine, Markets Media Group or its staff. Traders Magazine welcomes reader feedback on this column and on all issues relevant to the institutional trading community.