Concept Graph API

Concept graphs are collections of Node instances connected by edges. The edges represent inferences and are instances of the Inference class. Virtually all interaction with the graph takes place using the ConceptInstanceGraph class. While the nodes and edges are exposed when extracting mining results, they are not commonly accessed directly.

Mining is initiated using seeds. A seed is an initial graph node from which a concept instance is grown by iterative associative reasoning.

The graph can be mined in two ways: either from a specific seed selected by the API user, or by asking the API to auto-select the most promising seed. Automatic seed selection and mining can be repeated to obtain a set of concept instances that ‘covers’ the entire graph, yielding a complete set of knowledge extracted from the event data.

Concept mining results are obtained using the extract_result_set method of the ConceptInstanceGraph class. It returns a MinedConceptInstanceCollection instance, which is documented in detail separately.

Class Documentation

The class documentation can be found below.

ConceptInstanceGraph

class edxml.miner.graph.ConceptInstanceGraph(ontology=None)

Bases: object

Class representing a graph of concept nodes. The graph can contain information about a single concept instance or about multiple instances. Depending on the graph topology, these instances may or may not be related.

add(node)
Parameters:node (Node) –
mine(seed=None, min_confidence=0.1, max_depth=10)

Mines the graph for concept instances. When a seed is specified, only the concept instance containing the specified seed is mined. When no seed is specified, an optimum set of seeds will be selected and mined, covering the entire graph. The algorithm will auto-select the strongest concept identifiers spread across the graph as seeds. Any previously obtained concept mining results will be discarded in the process.

Concept instances are constructed within the specified confidence and recursion depth limits.

Parameters:
  • seed (EventObjectNode) – Concept seed
  • min_confidence (float) – Confidence cutoff
  • max_depth (int) – Max recursion depth
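To illustrate how the two limits bound the growth of a concept instance, the following toy model expands a seed over a plain dictionary of weighted edges. It is not the library's algorithm: the graph representation, the grow function, and the rule that confidence is multiplied along a path are all assumptions made for this sketch.

```python
from collections import deque


def grow(edges, seed, min_confidence=0.1, max_depth=10):
    """Toy expansion of a concept instance from a seed.

    edges maps a node name to a list of (neighbour, edge_confidence) pairs.
    Returns a dict mapping each reached node to its best known confidence.
    """
    confidences = {seed: 1.0}
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:
            continue  # recursion depth limit reached, do not expand further
        for neighbour, edge_confidence in edges.get(node, []):
            # Assumption: confidence decays multiplicatively along a path.
            confidence = confidences[node] * edge_confidence
            if confidence < min_confidence:
                continue  # below the confidence cutoff
            if confidence > confidences.get(neighbour, 0.0):
                confidences[neighbour] = confidence
                queue.append((neighbour, depth + 1))
    return confidences


toy_edges = {
    "a": [("b", 0.9), ("c", 0.05)],
    "b": [("d", 0.5)],
    "d": [("e", 0.5)],
}
full = grow(toy_edges, "a")
shallow = grow(toy_edges, "a", max_depth=2)
```

In this example node "c" is never reached because its confidence falls below the cutoff, and node "e" is dropped from the shallow run because it lies three steps from the seed.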
find_optimal_seed(max_taint=0)

Finds and returns the optimal seed for constructing a new concept instance, or None if all nodes are too heavily tainted. The taint of a node is the confidence that the node is part of any previously mined concept. A seed is considered optimal if it is a strong identifier of a concept.

Parameters:max_taint (float) – Node taint limit
Returns:
Return type:Optional[EventObjectNode]
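The interplay between taint and seed selection can be sketched with a small stand-alone model. The Candidate tuple, its strength score, and the pick_seed function are hypothetical. The sketch also assumes that nodes whose taint exceeds max_taint are excluded, which is how the taint limit is read here rather than something this documentation spells out.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str
    strength: float  # hypothetical measure of how strongly the node identifies a concept
    taint: float     # confidence that the node belongs to an already mined concept


def pick_seed(candidates: Sequence[Candidate], max_taint: float = 0) -> Optional[Candidate]:
    """Return the strongest candidate within the taint limit, or None."""
    eligible = [c for c in candidates if c.taint <= max_taint]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.strength)


candidates = [
    Candidate("phone-number", strength=0.9, taint=0.8),  # strong, but already mined
    Candidate("email-address", strength=0.7, taint=0.0),
    Candidate("first-name", strength=0.2, taint=0.0),
]
untainted_choice = pick_seed(candidates)
lenient_choice = pick_seed(candidates, max_taint=1.0)
```

Raising max_taint in this model lets the already mined but stronger identifier win again, which is why a low limit is needed to steer selection toward unexplored parts of the graph.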
extract_result_set(min_confidence=0.1)

Extracts the concept mining results from the graph, skipping any results that have a confidence below the specified threshold.

Parameters:min_confidence (float) – Confidence threshold
Returns:
Return type:MinedConceptInstanceCollection
reset()

Clears artifacts of previous concept mining from the graph.

Node

class edxml.miner.Node(object_type_name, value, confidence)

Bases: object

object_type_name = None

The name of the object type associated with the node

value = None

The object value that is represented by the node

confidence = None

Confidence of the node

time_span = None

Time line of node confidence

reason = None

The reason of a node is a reference to one of the edges which was used during reasoning to arrive at this node.

conclusions = None

The conclusions of a node are references to zero or more of its edges which were used during reasoning to infer other nodes.

add_inward(edge)

Adds specified edge as an inward edge.

Parameters:edge (Inference) –
add_outward(edge)

Adds specified edge as an outward edge.

Parameters:edge (Inference) –
get_inferences()
Returns:
Return type:Iterable[Inference]
get_inter_concept_inferences()
Returns:
Return type:List[Inference]
get_intra_concept_inferences()
Returns:
Return type:List[Inference]
get_same_concept_inferences(seed, min_confidence)
Parameters:
  • seed (Node) – Concept seed
  • min_confidence (float) – Minimum confidence
Returns:
Return type:List[Inference]

clear_edge_roles()

Clears the roles that the edges play as either a reason or an argument. These roles are specific to the perspective of a particular seed.

reset()

Resets the state of the node to its initial state, clearing the edge roles, marking the node as unvisited, and so on.

EventObjectNode

class edxml.miner.node.EventObjectNode(event_id, concept_association, object_type_name, value, confidence, time_span)

Bases: edxml.miner.node.Node

Node representing a single instance of an object value.

Parameters:
  • relation (edxml.ontology.PropertyRelation) –
  • node (EventObjectNode) –

Inference

class edxml.miner.inference.Inference(source_node, target_node, confidence)

Bases: object

An edge in a concept instance graph representing the inference of a relation between two nodes.

Parameters:
reason(seed, confidence)

Performs a reasoning step by using this inference to go from the source node to the target node. If the target node was previously reached by reasoning from another source node, that source node is detached from the target node first.

Parameters:
  • seed (edxml.miner.Node) – Seed of the concept instance
  • confidence – New confidence of target node
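The bookkeeping described for reason() ties in with the reason and conclusions attributes of Node. The toy classes below mimic that bookkeeping only. ToyNode and ToyInference are hypothetical, they omit the seed argument, and the real classes may store these roles differently.

```python
class ToyNode:
    def __init__(self, name):
        self.name = name
        self.confidence = None
        self.reason = None      # edge used to arrive at this node
        self.conclusions = []   # edges used to infer other nodes from this one


class ToyInference:
    def __init__(self, source, target):
        self.source = source
        self.target = target

    def reason(self, confidence):
        """Use this edge to reach the target, replacing any earlier reason."""
        previous = self.target.reason
        if previous is not None and previous is not self:
            # Detach the source node that previously led to the target.
            previous.source.conclusions.remove(previous)
        if self not in self.source.conclusions:
            self.source.conclusions.append(self)
        self.target.reason = self
        self.target.confidence = confidence


a, b, c = ToyNode("a"), ToyNode("b"), ToyNode("c")
a_to_c = ToyInference(a, c)
b_to_c = ToyInference(b, c)

a_to_c.reason(0.4)  # c is first reached from a
b_to_c.reason(0.7)  # a better path via b replaces it
```

After the second call, node c has a single reason (the edge from b) and node a no longer lists the edge to c among its conclusions.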
compute_dijkstra_confidence(seed)

Returns the confidence of the target node for use as edge length when using Dijkstra’s algorithm for finding the shortest path to a given node.

Parameters:seed (edxml.miner.Node) – Concept seed
Returns:
Return type:float
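Dijkstra's algorithm minimises a sum of non-negative edge lengths, whereas the confidence of a path is more naturally a product of values between 0 and 1. One common way to bridge the two, assumed here and not confirmed by this documentation, is to use the negative logarithm of the confidence as edge length, so that the shortest path is also the most confident one. The function name and graph representation below are hypothetical.

```python
import heapq
import math


def most_confident_paths(edges, seed):
    """Return the best achievable confidence for every node reachable from seed.

    edges maps a node name to a list of (neighbour, confidence) pairs with
    0 < confidence <= 1.
    """
    distance = {seed: 0.0}
    heap = [(0.0, seed)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > distance.get(node, math.inf):
            continue  # stale heap entry, a shorter path was already found
        for neighbour, confidence in edges.get(node, []):
            # Assumption: edge length is -log(confidence), which is >= 0.
            candidate = dist - math.log(confidence)
            if candidate < distance.get(neighbour, math.inf):
                distance[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    # Convert path lengths back into confidences.
    return {node: math.exp(-dist) for node, dist in distance.items()}


toy_graph = {
    "seed": [("x", 0.9), ("y", 0.3)],
    "x": [("y", 0.8)],
}
best = most_confident_paths(toy_graph, "seed")
```

In this toy graph the indirect route to "y" through "x" (0.9 times 0.8) beats the direct edge with confidence 0.3.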