RadixData focuses on the asset integrity management (AIM) data mining problem, using a combination of subject matter expertise (SME), records classification and attribution technologies, custom search tools, database management, and data synchronization robots. We target the records and legacy systems and prepare a cost- and risk-stratified work plan in alignment with client objectives and budgets. Our goal is to produce the engineering data sets needed for pipeline integrity management and facility pressure vessels at a fraction of the cost typically incurred using standard methods.

We also aim to improve engineering efficiency by using our advanced methods to establish the data baseline that feeds a number of enterprise applications. Client project data is retained on our servers, and the desired data is populated into a custom SQL database that can feed other software, including PODS, ECM's RBI and PMS software applications.

Asset Integrity Management Records Centric Process

RadixData’s clients are operators of liquids and gas pipelines, gas plants, refineries and chemical plants, as well as the engineering services companies and enterprise software companies involved in the projects.  RadixData supports projects such as:

  • MAOP for liquids and gas pipelines
  • Facility “Fitness for Service”
  • Corrosion Reviews
  • Risk-Based Inspection Reviews

RadixData’s job is to rationalize the data to the objectives of the engagement and deliver a clean data set which can feed other applications. 

Define Records and Data in Scope

Asset Integrity Management requires data from multiple sources, some of which are vintage paper records and antiquated or proprietary systems. RadixData has a process for identifying those sources, analyzing their content, and developing a method for mining data from them. The process is complete when all data is gathered and a complete data set exists to import into an Asset Integrity Management System. Our clients leverage our expertise in addressing the problems of gathering vintage records and data to fill the gaps of missing data. The process of defining the records and data in scope and gathering the data is iterative and continues until all gaps in data are filled.

Data Investigation and Collection

RadixData will organize interviews with key stakeholders in the operator’s organization to fully understand record descriptions, search criteria, and the physical and virtual locations of data and records of interest. The interviews yield information held by long-time employees and client subject matter experts, providing a history of the types, locations, and nomenclature associated with legacy records. This input feeds the collection process, where collection teams are fielded to defensibly query and gather both physical and electronic records. Collected records and data are processed in our facilities to populate the project database.

Content Analytics

RadixData performs “Records Viability Analytics” on current metadata to assess the availability of critical record types. This metadata typically includes digital and physical file inventory descriptions and searchable index fields in content management systems. Although generally sparse and incomplete, the metadata hits point to the “low-hanging fruit” records which appear to be readily available. Performing analytics on the metadata creates a profile of the available records, which helps the project team decide which records to process and in what order.
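To illustrate the idea of profiling metadata, the following is a minimal Python sketch that counts how many inventory descriptions mention each record type. The record type names, the keyword lists, and the `profile_metadata` function are hypothetical and chosen only for illustration; they do not describe RadixData's actual tooling.

```python
from collections import Counter

# Hypothetical keywords per record type; a real project would derive these
# from stakeholder interviews and the client's own nomenclature.
RECORD_KEYWORDS = {
    "pressure_test": ["hydrotest", "pressure test"],
    "manufacturer_data_report": ["u-1a", "data report"],
    "inspection_report": ["inspection", "ut reading"],
}


def profile_metadata(descriptions):
    """Count descriptions that mention at least one keyword of each record type."""
    hits = Counter()
    for description in descriptions:
        text = description.lower()
        for record_type, keywords in RECORD_KEYWORDS.items():
            if any(keyword in text for keyword in keywords):
                hits[record_type] += 1
    return hits


inventory = [
    "Box 12: Hydrotest charts 1974",
    "U-1A data report, vessel V-101",
    "Misc correspondence",
]
profile = profile_metadata(inventory)
for record_type in RECORD_KEYWORDS:
    print(record_type, profile[record_type])
```

Record types with few or no hits are candidates for a deeper inventory before any processing decisions are made.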

RadixData What’s in the Box (WIB) is complementary to content management systems. WIB can create a detailed inventory that expands the current content management system inventory, broadening the scope of records available for processing. This can be done at a low cost with high returns for AIM projects.

Data Mining Objectives

The data mining objectives associated with preparing a highly curated engineering data set for an AIM program typically consist of the following components:

  • Establishing a data schema which meets the needs of the engineering and risk analysis
  • Locating and managing high value records across silos of physical and electronic repositories in a cost-efficient manner, where the records are usually poorly indexed and not searchable
  • Locating complementary data from current and legacy structured data which can be used as validation and/or lookup tables for correlating data
  • Identifying gaps in record coverage and establishing a remediation plan to identify or generate missing records
  • Applying business rules (e.g., converting fractions to decimals), low-level engineering tasks (e.g., looking up applicable ASME codes) and data normalization
  • Producing a database containing the data elements collected from the available documents and databases
  • Populating software applications including risk-based inspection (RBI), plant management systems (PMS) and geographical information systems (GIS)
  • Performing trending and anomaly analysis of data such as vessel and pipeline inspection points and corrosion
  • Achieving ROI on investments in software and services to perform risk assessment
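
The fraction-to-decimal business rule mentioned in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RadixData's implementation: the `fraction_to_decimal` function name, the accepted input formats, and the default of four decimal places are all hypothetical.

```python
from fractions import Fraction


def fraction_to_decimal(text: str, places: int = 4) -> float:
    """Convert a dimension such as '3/8', '1 1/2', '1-1/2' or '0.375' to a decimal.

    Assumes non-negative dimensions, so a hyphen is treated as the separator
    between the whole number and the fraction, not as a minus sign.
    """
    cleaned = text.strip().replace('"', "").replace("-", " ")
    parts = cleaned.split()
    if not parts:
        raise ValueError("empty dimension")
    # Each part is a whole number, a decimal, or a simple fraction.
    total = sum((Fraction(part) for part in parts), Fraction(0))
    return round(float(total), places)


print(fraction_to_decimal("3/8"))
print(fraction_to_decimal('1 1/2"'))
```

Using exact `Fraction` arithmetic before rounding avoids accumulating floating-point error when a whole number and a fraction are added.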

Data Gathering Process

Our observations of the difficulties associated with current data gathering processes include the following:

  • Poor knowledge of available records needed to complete the engineering and risk assessment
  • Silos of on-site physical records and corporate repositories of images which are poorly organized and indexed
  • Poor quality record images (PDF and TIFF) which are not searchable
  • Large volumes of components (vessels, piping, valves, meter stations, pipeline) which require engineering review
  • Engineering resources used for manual tasks such as locating records and performing data entry
  • Poorly defined quality control processes to ensure data integrity
  • Antiquated or obsolete systems containing valuable information
  • Proprietary systems holding data hostage
  • Data spread across many databases, Excel spreadsheets or other systems

The table is reflective of the amount of missing data (white columns) needed to perform the risk evaluation (blue columns). Missing data (red cells) is potentially resident in historical records. Operators typically do not have good methods to locate the relevant records.

Business Rules Collaboration

Collaboration to develop the business rules begins at the onset of the engagement and continues throughout the process. Initial business rule definitions are based on the known available data and determine the primary record source for each data point. As data gaps are realized, secondary and tertiary records are identified to fill the gaps. Business Rules allow for the standardization of attributes by defining the unit of measure, the numerical format, the degree of accuracy (decimals), and, if the value is calculated, how it is derived. Business Rules are the foundation for creating complete and defensible data. They clearly define the process for collecting, validating and standardizing data, making the process reliable. The process is repeated for each attribute and is therefore traceable.
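
As an illustration of how such a rule might be captured in code, the sketch below defines one rule per attribute with its unit, decimal precision, and an optional derivation. The `AttributeRule` class, the `apply_rule` function, and the example attributes are hypothetical names introduced for this sketch only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AttributeRule:
    name: str                       # attribute the rule standardizes
    unit: str                       # standard unit of measure
    decimals: int                   # degree of accuracy
    derivation: Optional[str] = None                     # how a calculated value is derived
    compute: Optional[Callable[[dict], float]] = None    # calculation, if any


def apply_rule(rule: AttributeRule, inputs: dict) -> dict:
    """Standardize one attribute and record how the value was obtained."""
    raw = rule.compute(inputs) if rule.compute else inputs[rule.name]
    value = round(float(raw), rule.decimals)
    return {
        "attribute": rule.name,
        "value": value,
        "unit": rule.unit,
        "text": f"{value:.{rule.decimals}f} {rule.unit}",
        "derivation": rule.derivation or "recorded value",
    }


# Hypothetical calculated attribute: wall loss from two recorded thicknesses.
wall_loss = AttributeRule(
    name="wall_loss",
    unit="in",
    decimals=3,
    derivation="nominal_thickness - measured_thickness",
    compute=lambda d: d["nominal_thickness"] - d["measured_thickness"],
)
print(apply_rule(wall_loss, {"nominal_thickness": 0.375, "measured_thickness": 0.310}))
```

Because every output row carries its unit and derivation, the same rule applied to every record yields values that can be traced back to how they were produced.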

Data Inventory and Transformation

Source data from shared drives containing PDFs, Excel spreadsheets, databases and/or digital images is processed through our Data Inventory and Transformation (DIT) process. In this process, hash values are generated for chain of custody, zip file content is extracted and flattened so that each file is inventoried, blacklisted file types are removed, and de-duplication is performed. The inventory allows the files and their pages to be tracked throughout the entire processing cycle. RadixData provides a DIT report from the inventory database, which contains the batch information. The DIT inventory is performed for each drive or collection of data received and can be reported on as a whole for the entire project and by each source.
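
The inventory steps described above (hashing, zip flattening, blacklist filtering, de-duplication) can be sketched with the Python standard library. This is a simplified illustration, not the DIT process itself: the function names, the blacklist contents, the `archive.zip!inner/file` path convention, and the choice of SHA-256 are assumptions.

```python
import hashlib
import zipfile
from pathlib import Path

BLACKLIST = {".exe", ".dll", ".tmp"}  # hypothetical blacklisted file types


def iter_files(root: Path):
    """Yield (logical_path, content) for every file, flattening .zip archives one level."""
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Only .zip is expanded; Office files are also zip containers but stay whole.
        if path.suffix.lower() == ".zip" and zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                for info in archive.infolist():
                    if not info.is_dir():
                        yield f"{path.name}!{info.filename}", archive.read(info)
        else:
            yield str(path.relative_to(root)), path.read_bytes()


def build_inventory(items):
    """Hash each file, drop blacklisted types, and flag repeats of earlier content."""
    inventory, seen = [], set()
    for logical_path, content in items:
        if Path(logical_path).suffix.lower() in BLACKLIST:
            continue
        digest = hashlib.sha256(content).hexdigest()
        inventory.append(
            {"path": logical_path, "sha256": digest, "duplicate": digest in seen}
        )
        seen.add(digest)
    return inventory


# Usage against a received drive (the path is a placeholder):
# inventory = build_inventory(iter_files(Path("/data/received_drive")))
```

Flagging duplicates rather than deleting them keeps every received file in the inventory, which is one way to preserve a traceable chain of custody.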

Records Classification

RadixData performs discrete records classification after the DIT process, assigning each document to a classification based on its title and/or the structure of its content. Where a title is available and can be matched, the document is identified as belonging to a particular classification. While your organization may hold many unstructured sources, it is unlikely that all of the information is valuable to you. By using our services and the power of computer processing, the attribute extraction platform can sort through the sources and find only the information you need. Think of it as an automatic filter for any unstructured data that you are managing.

For example, a U-1A is a form managed by regulators and has a clearly defined title: FORM U-1A MANUFACTURER'S DATA REPORT FOR PRESSURE VESSELS. In some cases the title is cut off or not legible, but the document can still be classified based on its content and how that content is structured. A U-1A contains information in a specific order with a label for each item. The FORM U-2A MANUFACTURER’S PARTIAL DATA REPORT (ALTERNATE FORM) provides most of the same information, but the order and structure of the information and labels differ, allowing each document to be classified as its respective type.
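
A minimal Python sketch of this two-step approach follows: match the title first, then fall back to the order in which field labels appear. The label sequences below are placeholders invented for illustration and are not the actual layouts of the U-1A or U-2A forms; the `classify` function and its return values are likewise hypothetical.

```python
import re

TITLE_PATTERNS = {
    "U-1A": re.compile(r"FORM\s+U-?1A\b", re.IGNORECASE),
    "U-2A": re.compile(r"FORM\s+U-?2A\b", re.IGNORECASE),
}

# Placeholder label orders (NOT the real form layouts): the two types share
# labels but present them in a different sequence.
LABEL_ORDER = {
    "U-1A": ["manufactured by", "manufactured for", "location of installation"],
    "U-2A": ["manufactured by", "location of installation", "manufactured for"],
}


def classify(text: str):
    """Return (classification, method); the title match takes precedence."""
    for name, pattern in TITLE_PATTERNS.items():
        if pattern.search(text):
            return name, "title"
    lowered = text.lower()
    for name, labels in LABEL_ORDER.items():
        positions = [lowered.find(label) for label in labels]
        # All labels must be present and appear in the expected order.
        if -1 not in positions and positions == sorted(positions):
            return name, "structure"
    return None, "unclassified"


print(classify("FORM U-1A MANUFACTURER'S DATA REPORT FOR PRESSURE VESSELS"))
print(classify("Manufactured by: A\nLocation of installation: B\nManufactured for: C"))
```

Documents that come back as unclassified would be routed to a person for review rather than forced into a type.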

Gap Analysis

RadixData performs a Gap Analysis to determine the presence or absence of key documents. This is a key step in the process. Often, we find that organizations believe their data is complete when in fact they are missing data. This can occur when records are labeled incorrectly or misfiled. The gap analysis occurs in tandem with the data scraping and quality check processes and is repeated until all desired records and data points are present. In some instances, the primary record may not be present, but the gap analysis is satisfied according to the business rules by secondary and tertiary documents. The process is iterative and is deemed complete by our client. Upon completion of the project, a final gap analysis is performed.
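
The fallback from primary to secondary and tertiary records can be sketched as a lookup over an ordered list of acceptable sources. The data points, record type names, and function names below are hypothetical and serve only to illustrate the logic.

```python
# For each data point, acceptable record types listed primary first (hypothetical).
SOURCE_PRIORITY = {
    "design_pressure": ["U-1A", "nameplate", "design_drawing"],
    "hydrotest_pressure": ["pressure_test_report"],
    "wall_thickness": ["U-1A", "inspection_report"],
}


def gap_analysis(source_priority, records_found):
    """Map each data point to the best available record type, or None if it is a gap."""
    return {
        data_point: next((s for s in sources if s in records_found), None)
        for data_point, sources in source_priority.items()
    }


def open_gaps(report):
    """List the data points that no available record can satisfy."""
    return sorted(point for point, source in report.items() if source is None)


report = gap_analysis(SOURCE_PRIORITY, {"nameplate", "inspection_report"})
print(report)
print(open_gaps(report))
```

Rerunning the same check after each collection round shows which gaps have closed and which still need a remediation plan.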

Data Extraction

RadixData’s intelligent algorithm is made up of rules written from the experience of processing millions of documents containing incomplete, old, and in some instances unstructured data, similar to the challenges you face today. The platform can easily be enhanced via a customer-specific layer that takes advantage of all the existing knowledge while adding new rules specific to your organization's needs. A powerful feature is the ability to pre-validate information found in your data: the platform can match extracted values against existing information in your database, and it can also pre-validate against standard logic or criteria defined by your business rules. Our automated data extraction software, coupled with human verification, helps your organization achieve the highest level of accuracy.
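
As a rough illustration of pre-validation, the sketch below compares extracted values with an existing database row where one exists and otherwise applies a rule-based check. The field names, the checks, the status labels, and the functions are hypothetical and do not describe the platform's actual interface.

```python
def prevalidate(extracted, database_row, checks):
    """Assign each extracted field a pre-validation status."""
    results = {}
    for field, value in extracted.items():
        if field in database_row:
            matches = database_row[field] == value
            results[field] = "matches_database" if matches else "conflicts_with_database"
        elif field in checks:
            results[field] = "passes_rule" if checks[field](value) else "fails_rule"
        else:
            results[field] = "unchecked"
    return results


def needs_review(results):
    """Fields that should be routed to a person for verification."""
    flagged = {"conflicts_with_database", "fails_rule", "unchecked"}
    return sorted(field for field, status in results.items() if status in flagged)


# Hypothetical extracted values, existing database row, and business-rule checks.
extracted = {
    "serial_number": "V-1001",
    "year_built": 1875,
    "design_pressure_psi": 150.0,
    "material": "SA-516-70",
}
database_row = {"serial_number": "V-1001"}
checks = {
    "year_built": lambda year: 1900 <= year <= 2030,
    "design_pressure_psi": lambda pressure: pressure > 0,
}

results = prevalidate(extracted, database_row, checks)
print(needs_review(results))
```

Only the flagged fields go to human verification, which concentrates reviewer time on the values most likely to be wrong.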

Import Data into AIM System

RadixData is technology agnostic when delivering the results of an Asset Integrity Project. All information is formatted to meet the specifications of the AIM System into which the data is imported. The Business Rules also define what data is delivered to the AIM System and how. Since the Business Rules define how data is standardized, extracted, calculated, formatted and delivered, the results are traceable, verifiable and reliable.

The following are examples of standardized data definitions:

  • Standard Units of Measure
  • Calculation to a Unit of Precision
  • Conversions to meet Standard Unit of Measure

How We Help

Operations & Engineering Support

Several high-profile events (and numerous smaller events) have increased the pressure on operators to analyze corrosion trends and safe operating pressures. Locating, organizing, classifying and indexing the data often becomes a critical-path challenge for the engineer’s RBI and risk algorithm tools, which depend on large amounts of high-quality data as input. The operational alternatives to finding this data include performing field work to validate the integrity of metal, seams, corrosion rates and the like, and resorting to more “pigs and digs”.

Regulatory Compliance

Operation of these assets falls under several regulatory bodies, including the Department of Transportation (DOT), the Environmental Protection Agency (EPA) and the Occupational Safety and Health Administration (OSHA). In accordance with these regulations, operators must have procedures and standards (such as ASME and API) for the ongoing maintenance, inspection, and reporting obligations associated with operation of their infrastructure, with an emphasis on cost and risk management. Key to the mission is locating and harvesting design, materials, original construction, repair, testing and corrosion history data in order to establish a baseline of risk.

Risk Management

Operators of midstream and downstream facilities, including pipelines, refineries and chemical plants, seek to ensure the safe and efficient operation of their infrastructure and compliance with federal regulations relating to public safety. In addition, operators seek to improve maintenance planning and CAPEX planning commensurate with risk analysis. The affected infrastructure is often decades old and has been subject to numerous merger and acquisition events, which has resulted in silos of poorly understood and poorly indexed “dark data”. This is compounded by an aging workforce that possesses the last remnants of historical understanding of the facilities.

RADIX DATA, LLC
1773 Westborough Dr.
Katy, TX 77449
info@radixdata.com