Structuring Data for Predictive Analytics

By Socialgist on September 18, 2019


I wrote this paragraph almost three years ago as an opener to a business planning document.

Data – It is a golden age of data, or at least the start. We’re watching as many new and established firms are liberating, creating or collecting information from processes, people and devices. All of this data is unorganized, opaque and unstructured. 

In the time since then, the pace at which firms acquire, use and distribute data has only quickened. Even though the tempo of data creation has accelerated, I’m still struck by how early we are in the life cycle of our data industry.

I’ve spent most of my life working with data and applications in financial services. The majority of my time in finance was concentrated on problems in the investment space, helping institutions generate investment returns.

Last week, Socialgist did a little video interview to promote our upcoming role at the Battle of the Quants conference in Hong Kong. Socialgist has been working intensively with clients and partners to enhance data, not only for finance, but for many other industries. It is an extremely exciting time for us, given how much data is currently unstructured and nearly impossible to use.

What we have come to realize is that only a small percentage of customers can truly add value and structure to the data we provide on their own. For that reason, we’re now embarking on a series of initiatives to add ‘meaningful’ structure to our datasets ourselves. These enhancements will allow clients to use this data for many purposes across different and challenging industries. For example, we are now tagging content for companies, products and brands, and rolling each tag up to a standard industry identifier, which makes this data easier to ingest and combine with other data.
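As a rough illustration of what tagging and rolling up can look like, here is a minimal Python sketch. It is not a description of our actual pipeline: the lookup table, the entity names (Acme, Globex) and the identifier values (IND-001, IND-002) are all hypothetical placeholders, and a simple keyword match stands in for whatever entity recognition a real system would use.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical lookup table: mention -> (entity type, parent company, industry identifier).
# The identifiers are made-up placeholders, not codes from any real classification scheme.
ENTITY_TABLE = {
    "acme phone": ("product", "Acme Corp", "IND-001"),
    "acme": ("brand", "Acme Corp", "IND-001"),
    "globex": ("company", "Globex Inc", "IND-002"),
}


@dataclass(frozen=True)
class Tag:
    mention: str
    entity_type: str
    company: str
    industry_id: str


def tag_content(text: str) -> List[Tag]:
    """Return one Tag per known mention found in text, checking longest mentions first."""
    tags = []
    remaining = text.lower()
    for mention in sorted(ENTITY_TABLE, key=len, reverse=True):
        pattern = r"\b" + re.escape(mention) + r"\b"
        if re.search(pattern, remaining):
            entity_type, company, industry_id = ENTITY_TABLE[mention]
            tags.append(Tag(mention, entity_type, company, industry_id))
            # Blank out the match so "acme" is not also counted inside "acme phone".
            remaining = re.sub(pattern, " ", remaining)
    return tags


def industries(tags: List[Tag]) -> List[str]:
    """Roll the tags up to a sorted list of distinct industry identifiers."""
    return sorted({t.industry_id for t in tags})


if __name__ == "__main__":
    post = "I love my Acme Phone but Globex support is slow"
    found = tag_content(post)
    for tag in found:
        print(tag)
    print(industries(found))
```

Once every piece of content carries a shared identifier like this, joining it against another dataset keyed on the same identifier becomes a simple lookup rather than a text-matching exercise.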

So why has data become a critical component of everyday operations for so many companies? For the same reason Andrew Carnegie realized steel was the raw input everyone needed during the rise of US industry in the late 19th century. Or the way John D. Rockefeller understood oil as the basic abundant energy source that could power all that new industry. These component materials, once refined and structured, allow us to build and power new business opportunities. Steel becomes railroads, buildings and shipping. Crude oil becomes electricity, transportation and manufacturing.

So, what does data become? Let’s ask Andrew Ng, former Chief Scientist of Baidu.

“What’s slowing down AI adoption? Two problems: scarcity of data and talent. For AI to be meaningful, companies need to feed their algorithms vast amounts of data, which isn’t always readily available. In fact, Ng says some large companies launch products for the payout of data, not revenue, and then later monetize it through a different product.”

Data as a raw input needs to be refined to be useful for analysis and algorithms.  Enhancing our raw data is a choice we’re making to serve our customers better. We look forward to making a few announcements in the coming weeks and months.

Rob Passarella, SVP Business Development