
Dark data is probably one of the most relevant concepts today that people don’t discuss enough.


“Dark data is data which is acquired through various computer network operations but not used in any manner to derive insights or for decision making. The ability of an organization to collect data can exceed the throughput at which it can analyze the data.”

It is data that enterprises collect, but don’t use to its full effect.

This likely understates the problem. Dark data is also messy and unstructured.

To say that data is messy means that it needs to be preprocessed before it can be used.

“Data preprocessing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviors or trends, and is likely to contain many errors. Data preprocessing is a proven method of resolving such issues.”
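To make that concrete, here is a minimal sketch of what preprocessing could look like for a handful of award records. The rows, field names, and date formats are made up for illustration; a real legacy export would have its own quirks.

```python
from datetime import datetime

# Hypothetical raw rows, as they might come out of a legacy procurement system.
RAW_ROWS = [
    {"supplier": "  Acme Corp ", "amount": "$12,500.00", "awarded": "2017-07-03"},
    {"supplier": "acme corp", "amount": "9800", "awarded": "03/08/2017"},
    {"supplier": "Globex", "amount": "", "awarded": "2017-09-15"},
]

# Assumed date formats; day-first is a guess for the second style.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format; return None when nothing fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def parse_amount(text):
    """Strip currency symbols and thousands separators; None if missing."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def preprocess(rows):
    """Return rows with consistent supplier names, amounts, and dates."""
    clean = []
    for row in rows:
        clean.append({
            "supplier": " ".join(row["supplier"].split()).title(),
            "amount": parse_amount(row["amount"]),
            "awarded": parse_date(row["awarded"]),
        })
    return clean


CLEAN_ROWS = preprocess(RAW_ROWS)
```

After this pass, the two spellings of the same supplier collapse into one, the amounts are numbers, and the missing amount is flagged as missing rather than silently treated as zero.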

Unstructured data, by contrast, “is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well.”
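As a rough illustration of pulling the dates and numbers out of text-heavy material, the sketch below runs two regular expressions over a made-up sentence from an RFP. The sample text and the patterns are assumptions; real documents would need far more forgiving rules.

```python
import re

# Hypothetical snippet of free text from an RFP document.
TEXT = (
    "RFP 2017-114 closes on 2017-07-31. "
    "Budget ceiling: $250,000 for 36 months of service."
)

# Assumed patterns: ISO-style dates and dollar amounts only.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
MONEY_RE = re.compile(r"\$\d[\d,]*(?:\.\d{2})?")


def extract_fields(text):
    """Pull dates and dollar amounts out of free text."""
    amounts = [
        float(match.replace("$", "").replace(",", ""))
        for match in MONEY_RE.findall(text)
    ]
    return {"dates": DATE_RE.findall(text), "amounts": amounts}


FIELDS = extract_fields(TEXT)
```

Even this toy version shows why the work is tedious: the RFP number looks a lot like a date, and only a strict pattern keeps the two apart.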

Large enterprises today have large, growing pools of messy, unstructured data that nobody is using. It’s not a data lake. It’s a series of loosely connected cesspools. This unused data often sits in legacy systems, accessed by too few people.

In the case of procurement, access typically means procurement officers. When a large organization executes an RFP or RFQ, the company pairs up a businessperson, like a program manager, with a procurement officer. The function of the procurement staff is to make sure that the RFP complies with the rules while the program manager is focused on making sure that the acquisition meets the organization’s business requirements.

Practically, this means that the only people who access sourcing data are the procurement staff. Sure, the program manager and whatever committee is put in place to review the acquisition get to see the proposals that suppliers submit.

But, often, they don’t get to see information from their own organization’s prior RFPs directly; the businesspeople need the procurement staff to get that for them.

Businesspeople also don’t get to see the RFPs issued by other buyers, or the contracts signed by other organizations. Procurement staff often don’t get to see that information either. And even if anyone did, they would be hard-pressed to see it in a usable, preprocessed, structured format.

The solution to this is to have a layer with a 21st-century user interface that sits on top of the incumbent system, extracting the dark data, preprocessing it, and presenting it in a structured format accessible to a wide constituency across the organization.
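The shape of such a layer can be sketched in a few lines. Everything here is hypothetical: the `SourcingLayer` class, the `legacy_export` function standing in for the incumbent system, and the field names are illustrations, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class SourcingRecord:
    rfp_id: str
    title: str
    supplier: str
    amount: float


class SourcingLayer:
    """Hypothetical read-only layer over an incumbent procurement system."""

    def __init__(self, fetch_raw: Callable[[], Iterable[dict]]):
        self._fetch_raw = fetch_raw  # stand-in for a legacy export or API call
        self._records: List[SourcingRecord] = []

    def refresh(self) -> int:
        """Extract raw rows and keep only those that can be structured."""
        records = []
        for row in self._fetch_raw():
            try:
                records.append(SourcingRecord(
                    rfp_id=str(row["id"]).strip(),
                    title=str(row["title"]).strip(),
                    supplier=str(row["supplier"]).strip(),
                    amount=float(str(row["amount"]).replace(",", "")),
                ))
            except (KeyError, ValueError):
                continue  # a real layer would log or quarantine bad rows
        self._records = records
        return len(records)

    def search(self, keyword: str) -> List[SourcingRecord]:
        """Case-insensitive title search, open to any user of the layer."""
        needle = keyword.lower()
        return [r for r in self._records if needle in r.title.lower()]


def legacy_export():
    """Made-up rows standing in for the incumbent system's output."""
    return [
        {"id": " 17-001 ", "title": "Fleet maintenance services",
         "supplier": "Acme Corp", "amount": "120,000"},
        {"id": "17-002", "title": "Office cleaning",
         "supplier": "Globex", "amount": "n/a"},
        {"id": "17-003", "title": "Vehicle fleet telematics",
         "supplier": "Initech", "amount": "45000"},
    ]


layer = SourcingLayer(legacy_export)
loaded = layer.refresh()
fleet_hits = layer.search("fleet")
```

The point of the design is that the incumbent system is never modified. The layer only reads from it, structures what it can, and gives a program manager the same search a procurement officer would have.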

For years, people have spoken about the need to make the strategic sourcing process actually strategic, as opposed to the functional backwater it is in practice in many organizations. Here is Ayming talking about the issue in this July 2017 Procurement 2020 report:

“‘Procurement continues to become more strategic,’ says Alejandro Alvarez, director of operations performance at Ayming. ‘Companies that have seen the value that can be driven from good procurement would say it is now a strategic priority. Savings are not a particularly high focus. It’s more about service delivery.’”

If procurement staff really want to engage at a strategic level across their organizations, they need to make the information practically usable and available to the point that it influences the way decisions are made about product design, sales and marketing, and finance.

EdgeworthBox is a platform for exactly this purpose. It sits on top of the incumbent procurement processes and brings information to bear for use across the organization. We combine a marketplace with features from financial markets, including a central clearinghouse for administration, a central clearinghouse for data, and social networking tools. Think of it as a Bloomberg machine for procurement. We call it “network-based sourcing™.”

The data sits in a structured format, adhering to the Open Contracting Data Standard, partitioned into a public repository and a private repository. The public-side data is shared information about live RFPs, historical RFPs, and historical contract awards, typically from government agencies. The private-side data is organization-specific information about live RFPs currently being considered, historical RFPs and responses, and contracts the organization has signed. Users also get access to one another and to counterparts in other organizations on the EdgeworthBox platform.

Let’s talk. We’d love to hear about what you’re doing and how we can help you get more out of your existing processes and infrastructure.


© 2022 Homework Fairy. | All Rights Reserved.