Enriching Material Composition Data Using OWL Class-Based Inferencing

This is the fourth in a series of posts about using SHACL to validate material composition data for semiconductor products (microchips). This results from a recent project we undertook for Nexperia. In our first three posts, we looked at how to validate our material composition data:

  • In the first post we looked at the basic data model for material composition and how basic SHACL vocabulary can be used to describe the constraints.
  • In the second post we looked at how SPARQL-based constraints can be used to implement more complex rules based on a SPARQL SELECT query.
  • In the third post we looked at how aggregates can be used as part of validation rules.

In this post we will venture beyond validating data and consider how we can enrich the data by inferring new, additional statements. Before jumping into the implementation of the inferencing, let’s first look at what we want to be able to infer.

One of the requirements from Nexperia is that we should be able to generate IPC-1752A compliant XML files. IPC-1752A is the materials declaration standard for companies in the supply chain to share information on materials in products. In February 2014, a second amendment to IPC-1752A was published that, amongst other additions, added a new field for use with the IEC 62474 Declarable Substances list, aligning the standard with IEC 62474. Based on our now-validated material composition data, we want to automatically infer the IEC 62474 classification of each material.

IEC 62474 defines a set of material classes in a tree-like structure:

  • Inorganic materials

    • Metals and Metal Alloys

      • Ferrous alloys
        • M-001: Stainless steel
        • M-002: Other Ferrous alloys, non-stainless steels
      • Non-ferrous metals and alloys
        • M-003: Aluminum and its alloys
        • M-004: Copper and its alloys
        • M-005: Magnesium and its alloys
        • M-006: Nickel and its alloys
        • M-007: Zinc and its alloys
        • M-008: Precious metals
        • M-009: Other non-ferrous metals and alloys
    • Non-metals

      • M-010: Ceramics / Glass
      • M-011: Other inorganic materials
  • Organic materials

    • Plastics and rubber
      • M-012: PolyVinylChloride (PVC)
      • M-013: Other Thermoplastics
      • M-014: Other Plastics and Rubber
    • Other organics
      • M-015: Other Organic Materials
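
As a rough illustration, the tree above could be written down as an RDFS/OWL class hierarchy along the following lines. This is only a sketch: the iec: namespace IRI and the names of the intermediate grouping classes (iec:InorganicMaterial and so on) are assumptions made for this example, and only the M-001 to M-015 codes and their labels come from the list above.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
# Hypothetical namespace, chosen for this sketch only.
@prefix iec:  <http://example.org/iec62474#> .

# Intermediate grouping classes (names are illustrative assumptions).
iec:InorganicMaterial a owl:Class ; rdfs:label "Inorganic materials" .
iec:MetalOrMetalAlloy a owl:Class ; rdfs:label "Metals and Metal Alloys" ;
  rdfs:subClassOf iec:InorganicMaterial .
iec:FerrousAlloy a owl:Class ; rdfs:label "Ferrous alloys" ;
  rdfs:subClassOf iec:MetalOrMetalAlloy .
iec:NonFerrousMetalOrAlloy a owl:Class ; rdfs:label "Non-ferrous metals and alloys" ;
  rdfs:subClassOf iec:MetalOrMetalAlloy .
iec:NonMetal a owl:Class ; rdfs:label "Non-metals" ;
  rdfs:subClassOf iec:InorganicMaterial .
iec:OrganicMaterial a owl:Class ; rdfs:label "Organic materials" .
iec:PlasticOrRubber a owl:Class ; rdfs:label "Plastics and rubber" ;
  rdfs:subClassOf iec:OrganicMaterial .
iec:OtherOrganic a owl:Class ; rdfs:label "Other organics" ;
  rdfs:subClassOf iec:OrganicMaterial .

# Leaf classes, using the codes and labels from the list above.
iec:M-001 a owl:Class ; rdfs:label "Stainless steel" ; rdfs:subClassOf iec:FerrousAlloy .
iec:M-002 a owl:Class ; rdfs:label "Other Ferrous alloys, non-stainless steels" ; rdfs:subClassOf iec:FerrousAlloy .
iec:M-003 a owl:Class ; rdfs:label "Aluminum and its alloys" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-004 a owl:Class ; rdfs:label "Copper and its alloys" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-005 a owl:Class ; rdfs:label "Magnesium and its alloys" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-006 a owl:Class ; rdfs:label "Nickel and its alloys" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-007 a owl:Class ; rdfs:label "Zinc and its alloys" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-008 a owl:Class ; rdfs:label "Precious metals" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-009 a owl:Class ; rdfs:label "Other non-ferrous metals and alloys" ; rdfs:subClassOf iec:NonFerrousMetalOrAlloy .
iec:M-010 a owl:Class ; rdfs:label "Ceramics / Glass" ; rdfs:subClassOf iec:NonMetal .
iec:M-011 a owl:Class ; rdfs:label "Other inorganic materials" ; rdfs:subClassOf iec:NonMetal .
iec:M-012 a owl:Class ; rdfs:label "PolyVinylChloride (PVC)" ; rdfs:subClassOf iec:PlasticOrRubber .
iec:M-013 a owl:Class ; rdfs:label "Other Thermoplastics" ; rdfs:subClassOf iec:PlasticOrRubber .
iec:M-014 a owl:Class ; rdfs:label "Other Plastics and Rubber" ; rdfs:subClassOf iec:PlasticOrRubber .
iec:M-015 a owl:Class ; rdfs:label "Other Organic Materials" ; rdfs:subClassOf iec:OtherOrganic .
```

With a hierarchy like this in place, anything classified as iec:M-004 would also be inferred to be a non-ferrous metal, a metal and an inorganic material.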

We can define rules to classify the materials based on:

  • The type of the material
  • The composition of the material

These rules are defined by the subject matter experts within Nexperia. They are relatively static, but we can expect them to change over time as new material types and substances are introduced. Examples of these rules are:

  • Any Adhesive is a M-014: Other Plastics and Rubber
  • Any Clip is a M-004: Copper and its alloys
  • Any Component containing at least 50% Lead Oxide is a M-010: Ceramics / Glass
  • Any Component containing at least 40% Iron is a M-002: Other Ferrous alloys, non-stainless steels
  • Any Component containing some C.I. Pigment Violet 23 is a M-015: Other Organic Materials
  • Any Lead Frame containing at least 50% Copper is a M-004: Copper and its alloys
  • Any Lead Frame containing at least 50% Iron is a M-002: Other Ferrous alloys, non-stainless steels

Note that the ‘required’ percentage of Iron differs per material type: 40% for a Component but 50% for a Lead Frame. We can abstract these rules into the following generic cases:

  1. Any X is a Z
  2. Any X containing Y is a Z
  3. Any X containing at least N% of Y is a Z

When considering how to capture these rules, the ‘traditional’ answer in Semantic Web circles is to use OWL (the Web Ontology Language), so let’s try it. OWL inferencing is generally based on Description Logic, which essentially works by defining sets (classes) of things and how those sets relate to each other.

Any X is a Z

The “Any X is a Z” case can be easily handled using rdfs:subClassOf to relate the material type (a class) to the IEC material class. For example:

plm:Adhesive rdfs:subClassOf iec:M-014 .

So from the statement:

<132253533401> a plm:Adhesive .

We can infer:

<132253533401> a iec:M-014 .
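
The snippets in this post omit prefix declarations. A self-contained version of the same pattern, applied to the rule “Any Clip is a M-004: Copper and its alloys”, might look like the sketch below. The namespace IRIs, the base IRI, the class name plm:Clip and the identifier <132200000001> are all assumptions made for this example.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
# Hypothetical namespaces and base IRI, chosen for this sketch only.
@prefix plm:  <http://example.org/plm#> .
@prefix iec:  <http://example.org/iec62474#> .
@base <http://example.org/material/> .

# Rule: every Clip is also a member of IEC class M-004.
plm:Clip rdfs:subClassOf iec:M-004 .

# A clip with a made-up identifier, resolved against the base IRI.
<132200000001> a plm:Clip .

# A reasoner supporting rdfs:subClassOf would then be expected to add:
#   <132200000001> a iec:M-004 .
```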

Any X containing Y is a Z

The “Any X containing Y is a Z” case can be handled using OWL class restrictions.

Consider the concrete rule “Any Component containing some C.I. Pigment Violet 23 is a M-015: Other Organic Materials”.

We start by defining a new class ex:ContainsCIPigmentViolet23 based on a restriction on the plm:containsMaterialClass predicate:

ex:ContainsCIPigmentViolet23 owl:equivalentClass [
    rdf:type owl:Restriction ;
    owl:onProperty plm:containsMaterialClass ;
    owl:hasValue <132285000361>
  ] .

This will infer that a resource that contains material class <132285000361> is a ex:ContainsCIPigmentViolet23.

Next we can define a new class ex:ComponentContainingCIPigmentViolet23 as being equivalent to the intersection of the plm:Component and ex:ContainsCIPigmentViolet23 classes and being a subclass of iec:M-015:

ex:ComponentContainingCIPigmentViolet23 rdfs:subClassOf iec:M-015 ;
  owl:equivalentClass [
    rdf:type owl:Class ;
    owl:intersectionOf ( plm:Component ex:ContainsCIPigmentViolet23 )
  ] .

In combination with the previous restriction, this will infer that a resource that contains material class <132285000361> and is of type plm:Component is both a ex:ComponentContainingCIPigmentViolet23 and a iec:M-015.

So from the statements:

<132253533401> a plm:Component ;
  plm:containsMaterialClass <132285000361> .

We can infer:

<132253533401> a ex:ContainsCIPigmentViolet23, ex:ComponentContainingCIPigmentViolet23, iec:M-015 .
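
If the named helper classes are not needed for anything else, the same rule can in principle be stated as a single axiom with an anonymous class expression on the left-hand side of rdfs:subClassOf. The following is a sketch of that alternative, not the form used in this project; the namespace and base IRIs are assumptions, and support for such general class axioms varies between tools.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
# Hypothetical namespaces and base IRI, chosen for this sketch only.
@prefix plm:  <http://example.org/plm#> .
@prefix iec:  <http://example.org/iec62474#> .
@base <http://example.org/material/> .

# "Any Component containing some C.I. Pigment Violet 23 is a M-015",
# written without naming the intermediate classes.
[ rdf:type owl:Class ;
  owl:intersectionOf (
    plm:Component
    [ rdf:type owl:Restriction ;
      owl:onProperty plm:containsMaterialClass ;
      owl:hasValue <132285000361> ]
  )
] rdfs:subClassOf iec:M-015 .
```

The trade-off is that the intermediate types (such as ex:ContainsCIPigmentViolet23) no longer appear in the inferred data, which makes the result more compact but harder to trace back to the rule that produced it.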

Any X containing at least N% of Y is a Z

Finally, let’s consider the “Any X containing at least N% of Y is a Z” case. This is the most complex case and requires a mental backflip or two involving the qualified relationships that carry the mass percentage.

Specifically let’s look at the rule “Any Component containing at least 40% Iron is a M-002: Other Ferrous alloys, non-stainless steels”.

First we define the inferences for the qualified relations. We define a new class ex:MassPercentageMin40 based on a restriction on the plm:massPercentage predicate having an integer value of 40 or more:

ex:MassPercentageMin40 owl:equivalentClass [
    rdf:type             owl:Restriction ;
    owl:onProperty       plm:massPercentage ;
    owl:someValuesFrom [
      rdf:type             rdfs:Datatype ;
      owl:onDatatype       xsd:integer ;
      owl:withRestrictions ( [ xsd:minInclusive 40 ] )
    ]
  ] .

This will infer that a resource that has a mass percentage value of 40 or more is a ex:MassPercentageMin40.

We also define a new class ex:HasIron based on a restriction on the plm:target predicate:

ex:HasIron owl:equivalentClass [
    rdf:type owl:Restriction ;
    owl:onProperty plm:target ;
    owl:hasValue <132285000116>
  ] .

This will infer that a resource with target <132285000116> is a ex:HasIron.

Then we define another class ex:MassPercentageMin40Iron as being the equivalent of the intersection of the classes ex:MassPercentageMin40 and ex:HasIron that we just defined:

ex:MassPercentageMin40Iron owl:equivalentClass [ 
    rdf:type owl:Class ;
    owl:intersectionOf ( ex:MassPercentageMin40 ex:HasIron )
  ] .

This will infer that a resource representing the relationship of a mass percentage value of 40 or more of <132285000116> is a ex:MassPercentageMin40Iron.

Next we define a class ex:MaterialWithMassPercentageMin40Iron based on a restriction on the plm:qualifiedRelation predicate having at least one value of type ex:MassPercentageMin40Iron:

ex:MaterialWithMassPercentageMin40Iron owl:equivalentClass [
    rdf:type owl:Restriction ;
    owl:onProperty plm:qualifiedRelation ;
    owl:someValuesFrom ex:MassPercentageMin40Iron
  ] .

Finally we can define a new class ex:ComponentWithMassPercentageMin40Iron as being equivalent to the intersection of the plm:Component and ex:MaterialWithMassPercentageMin40Iron classes and being a subclass of iec:M-002:

ex:ComponentWithMassPercentageMin40Iron rdfs:subClassOf iec:M-002 ;
  owl:equivalentClass [
    rdf:type owl:Class ;
    owl:intersectionOf ( plm:Component ex:MaterialWithMassPercentageMin40Iron )
  ] .

Put together we can then infer that a Component that contains 40% or more of <132285000116> is a ex:ComponentWithMassPercentageMin40Iron and a iec:M-002.

So from the statements:

<331214891234> a plm:Component ;
  plm:name "3312 148 91234" ;
  plm:containsMaterialClass <132285000116> ;
  plm:qualifiedRelation [
    a plm:ContainsMaterialClassRelation ;
    plm:target <132285000116> ;
    plm:materialGroup "Pure metal" ;
    plm:massPercentage 100
  ] .

We can infer:

<331214891234> a plm:Component, ex:MaterialWithMassPercentageMin40Iron, ex:ComponentWithMassPercentageMin40Iron, iec:M-002 ;
  plm:name "3312 148 91234" ;
  plm:containsMaterialClass <132285000116> ;
  plm:qualifiedRelation [
    a plm:ContainsMaterialClassRelation, ex:MassPercentageMin40, ex:HasIron, ex:MassPercentageMin40Iron ;
    plm:target <132285000116> ;
    plm:materialGroup "Pure metal" ;
    plm:massPercentage 100
  ] .
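
The same pattern carries over to the other percentage-based rules by changing the threshold, the substance and the material type. As a sketch, the rule “Any Lead Frame containing at least 50% Iron is a M-002: Other Ferrous alloys, non-stainless steels” could be written as follows. The class name plm:LeadFrame, the ex:…Min50… class names and all namespace and base IRIs are assumptions made for this example. The sketch uses owl:someValuesFrom on plm:massPercentage so that a single matching value is enough for membership.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
# Hypothetical namespaces and base IRI, chosen for this sketch only.
@prefix plm:  <http://example.org/plm#> .
@prefix iec:  <http://example.org/iec62474#> .
@prefix ex:   <http://example.org/rules#> .
@base <http://example.org/material/> .

# A qualified relation carrying an integer mass percentage of 50 or more.
ex:MassPercentageMin50 owl:equivalentClass [
    rdf:type owl:Restriction ;
    owl:onProperty plm:massPercentage ;
    owl:someValuesFrom [
      rdf:type rdfs:Datatype ;
      owl:onDatatype xsd:integer ;
      owl:withRestrictions ( [ xsd:minInclusive 50 ] )
    ]
  ] .

# A qualified relation whose target is Iron (same definition as in the post).
ex:HasIron owl:equivalentClass [
    rdf:type owl:Restriction ;
    owl:onProperty plm:target ;
    owl:hasValue <132285000116>
  ] .

# A qualified relation of 50% or more Iron.
ex:MassPercentageMin50Iron owl:equivalentClass [
    rdf:type owl:Class ;
    owl:intersectionOf ( ex:MassPercentageMin50 ex:HasIron )
  ] .

# Any material with at least one such qualified relation.
ex:MaterialWithMassPercentageMin50Iron owl:equivalentClass [
    rdf:type owl:Restriction ;
    owl:onProperty plm:qualifiedRelation ;
    owl:someValuesFrom ex:MassPercentageMin50Iron
  ] .

# A Lead Frame (hypothetical class name) with 50% or more Iron is a M-002.
ex:LeadFrameWithMassPercentageMin50Iron rdfs:subClassOf iec:M-002 ;
  owl:equivalentClass [
    rdf:type owl:Class ;
    owl:intersectionOf ( plm:LeadFrame ex:MaterialWithMassPercentageMin50Iron )
  ] .
```

Only the threshold class and the final intersection differ from the Component rule; ex:HasIron is shared. Each new threshold, substance or material type does add another handful of helper classes, so the ontology grows quickly as rules are added.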

This demonstrates that it is possible to capture these rules using the class-based inferencing approach of OWL. The sample OWL ontology is available here.

IEC 62474 ontology - Visualization generated using WebVOWL

In the next post in the series, we’ll look at how we can use SPARQL to define the same rules.