Semantic Understanding of Unpredictable Data
Traditional approaches to data integration and data quality have focused on recognizing common syntactic patterns in the data and then coding rules specific to the situation. This forces programmers to anticipate every variation in the data — a frustrating, pervasive and ultimately ineffective task.
What sets the DataLens™ System apart from traditional methods is that the Data Lens takes a semantics-based approach, identifying key information no matter how it is presented. The benefit for the user is the ability to automate, standardize, and understand complex, unstructured data in real time.
Data Lenses do the hard work of managing all the complexities and infinite variability of product data. Data problems that in the past would have required extensive manual effort or custom code can now be eliminated with a more ‘intelligent’ and flexible approach. At their core, Data Lenses interpret and restructure product data:

- Understand Any Input – Data Lenses use patented technology to understand product data at the semantic level, irrespective of format, structure, language or domain. Because they are based on semantic recognition (not syntactic, pattern-based matching), Data Lenses are reusable across the enterprise and beyond. Once a particular domain is understood, it stays understood, irrespective of changes in word order, spellings, abbreviations, punctuation, etc. Because the semantic technology establishes meaning within context, ambiguities can be avoided and key information can be identified and extracted, even from within descriptions or free-text fields (both common product data integration challenges that would otherwise require significant programming or manual effort).
- Deliver Any Output – Data Lenses transform product data on the fly — standardizing, cleansing, enriching, classifying and translating as required:
  - Standardize – to any format or structure, including attributes, units of measure, long descriptions and short descriptions; and any file format, mapping or field standardization for integration purposes.
  - Classify – to any public schema such as UNSPSC, eClass, Federal Supply Class, etc., as well as any customized version or proprietary hierarchy.
  - Translate – from any language to any language, including all double-byte languages.
- Measure Data Quality – each line of product data is compared to domain-specific product data standards (or requirements). Missing or invalid data is highlighted, ready for correction or exception management in the Data Service Application (DSA), assuring that only reliable data is passed between systems and users.
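
To make the ideas in the list above concrete, here is a deliberately small Python sketch. It is not the Data Lens technology, whose internals are not described here; it is a toy built on hand-written lookup tables. It only illustrates two behaviors the list describes: extracting the same attributes from free text regardless of word order, abbreviation or spelling, and flagging records that are missing attributes a domain standard requires. The vocabulary (`ITEM_TERMS`, `UNIT_TERMS`), the `REQUIRED` attribute list and the function names are all hypothetical.

```python
import re

# Hypothetical vocabulary for one domain (resistors): each surface form,
# including abbreviations and a misspelling, maps to a canonical term.
ITEM_TERMS = {"resistor": "resistor", "res": "resistor", "resistr": "resistor"}
UNIT_TERMS = {
    "ohm": ("resistance", "ohm"),
    "ohms": ("resistance", "ohm"),
    "w": ("power", "watt"),
    "watt": ("power", "watt"),
    "watts": ("power", "watt"),
}

# Hypothetical domain standard: attributes every resistor record must carry.
REQUIRED = ("item", "resistance", "power")

# Matches either "<number> <word>" (for example "10 ohm" or "1/4w") or a bare word.
TOKEN = re.compile(r"(\d+(?:\.\d+)?(?:/\d+)?)\s*([a-z]+)|([a-z]+)")


def extract(description):
    """Pull known attributes out of free text, independent of word order."""
    record = {}
    for number, unit, word in TOKEN.findall(description.lower()):
        if number and unit in UNIT_TERMS:
            attribute, canonical_unit = UNIT_TERMS[unit]
            record[attribute] = f"{number} {canonical_unit}"
        else:
            # A bare word, or a word that followed a number but is not a unit.
            term = word or unit
            if term in ITEM_TERMS:
                record["item"] = ITEM_TERMS[term]
    return record


def missing_attributes(record):
    """List the required attributes that the record does not contain."""
    return [name for name in REQUIRED if name not in record]


# Two differently worded descriptions yield the same structured record.
first = extract("RES, 10 OHM, 1/4W")
second = extract("1/4 watt 10 ohms resistr")
print(first == second)

# An incomplete description is flagged for correction.
print(missing_attributes(extract("Resistor 10 ohm")))
```

A lookup table like this still has to list every variant by hand, which is exactly the burden the text attributes to traditional approaches; the sketch shows only the desired input and output behavior, not how a semantic engine would achieve it.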

