edxml package

edxml.EDXMLBase module

This module contains generic base classes used throughout the SDK.

exception edxml.EDXMLBase.EDXMLError(message)

Bases: exceptions.Exception

Generic EDXML exception class

exception edxml.EDXMLBase.EDXMLProcessingInterrupted

Bases: exceptions.Exception

Exception for signaling that EDXML processing was aborted

class edxml.EDXMLBase.EDXMLBase

Bases: object

Base class for most SDK subclasses

Error(Message)

Raises EDXMLError.

Parameters:Message (str) – Error message
Warning(Message)

Prints a warning to sys.stderr.

Parameters:Message (str) – Warning message
GetWarningCount()

Returns the number of warnings generated

GetErrorCount()

Returns the number of errors generated

ValidateDataType(ObjectType, DataType)

Validate a data type.

Parameters:
  • ObjectType (str) – Name of the object type having specified data type
  • DataType (str) – EDXML data type

Calls Error() when the data type is invalid.

ValidateObject(Value, ObjectTypeName, DataType, Regexp=None)

Validate an object value.

The Value argument can be a string, int, bool, Decimal, etc depending on the data type.

Parameters:
  • Value – Object value.
  • ObjectTypeName (str) – Object type.
  • DataType (str) – EDXML data type of object.
  • Regexp (str, optional) – Regular expression for checking Value.

Calls Error() when the value is invalid.

NormalizeObject(Value, DataType)

Normalize an object value to a unicode string

Prepares an object value for computing sticky hashes, by applying the normalization rules as outlined in the EDXML specification. It takes a string containing an object value as input and returns a normalized unicode string.

Parameters:
  • Value (str, unicode) – The input object value
  • DataType (str) – EDXML data type
Returns:

unicode. The normalized object value

Calls Error() when the value is invalid.

edxml.EDXMLEvent module

class edxml.EDXMLEvent.EDXMLEvent(Properties, EventTypeName=None, SourceUrl=None, Parents=None, Content=None)

Bases: _abcoll.MutableMapping

Class representing an EDXML event.

The event allows its properties to be accessed and set much like a dictionary:

Event['property-name'] = 'value'

Note

Properties are lists of object values. On assignment, non-list values are automatically wrapped into lists.
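
The wrapping behaviour described in this note can be illustrated with a small stand-in class. This is only a sketch: EventSketch is a hypothetical name that is not part of the SDK, it is written for Python 3, and it mimics nothing beyond the dictionary-style access documented here.

```python
from collections.abc import MutableMapping


class EventSketch(MutableMapping):
    """Hypothetical stand-in that mimics the documented property access."""

    def __init__(self, properties):
        self._properties = {}
        for name, value in properties.items():
            self[name] = value

    def __setitem__(self, name, value):
        # Non-list values are wrapped into a single-item list.
        self._properties[name] = list(value) if isinstance(value, list) else [value]

    def __getitem__(self, name):
        return self._properties[name]

    def __delitem__(self, name):
        del self._properties[name]

    def __iter__(self):
        return iter(self._properties)

    def __len__(self):
        return len(self._properties)


event = EventSketch({'address': '192.0.2.1'})
event['port'] = ['80', '443']
```

Reading event['address'] afterwards yields a list, even though a plain string was assigned.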

__init__(Properties, EventTypeName=None, SourceUrl=None, Parents=None, Content=None)

Creates a new EDXML event. The Properties argument must be a dictionary mapping property names to object values. Object values may be single values or a list of multiple object values.

Parameters:
  • Properties (dict(str, list)) – Dictionary of properties
  • EventTypeName (str, optional) – Name of the event type
  • SourceUrl (str, optional) – Event source URL
  • Parents (list, optional) – List of parent hashes
  • Content (unicode, optional) – Event content
Returns:

EDXMLEvent

copy()

Returns a copy of the event.

Returns:EDXMLEvent
classmethod Create(Properties, EventTypeName=None, SourceUrl=None, Parents=None, Content=None)

Creates a new EDXML event.

Parameters:
  • Properties (dict(str,list)) – Dictionary of properties
  • EventTypeName (str, optional) – Name of the event type
  • SourceUrl (str, optional) – Event source URL
  • Parents (list, optional) – List of parent hashes
  • Content (unicode, optional) – Event content
Returns:

Return type:

EDXMLEvent

CopyPropertiesFrom(SourceEvent, PropertyMap)

Copies properties from another event, mapping property names according to specified mapping. The PropertyMap argument is a dictionary mapping property names from the source event to property names in the target event, which is the event that is used to call this method.

If multiple source properties map to the same target property, the objects of both properties will be combined in the target property.

Parameters:
  • SourceEvent (EDXMLEvent) –
  • PropertyMap (dict(str,str)) –
Returns:

Return type:

EDXMLEvent
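
The mapping semantics described above can be sketched with plain dictionaries. The function below is an illustration, not the SDK implementation: copy_properties and the property names are hypothetical, and skipping source properties that are absent is an assumption.

```python
def copy_properties(source, target, property_map):
    """Copy object lists from source to target, renaming via property_map."""
    for source_name, target_name in property_map.items():
        if source_name not in source:
            # Assumption: unknown source properties are silently skipped.
            continue
        # Objects from several source properties accumulate in one target.
        target.setdefault(target_name, []).extend(source[source_name])
    return target


source = {'src-ip': ['192.0.2.1'], 'dst-ip': ['192.0.2.2'], 'note': ['x']}
target = copy_properties(source, {}, {'src-ip': 'address', 'dst-ip': 'address'})
```

Both source properties map to 'address', so the target ends up with the objects of both. Properties that do not appear in the mapping, such as 'note', are not copied.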

MovePropertiesFrom(SourceEvent, PropertyMap)

Moves properties from another event, mapping property names according to specified mapping. The PropertyMap argument is a dictionary mapping property names from the source event to property names in the target event, which is the event that is used to call this method.

If multiple source properties map to the same target property, the objects of both properties will be combined in the target property.

Parameters:
  • SourceEvent (EDXMLEvent) –
  • PropertyMap (dict(str,str)) –
Returns:

Return type:

EDXMLEvent

SetType(EventTypeName)

Set the event type.

Parameters:EventTypeName (str) – Name of the event type
Returns:
Return type:EDXMLEvent
SetContent(Content)

Set the event content.

Parameters:Content (unicode) – Content string
Returns:
Return type:EDXMLEvent
SetSource(SourceUrl)

Set the event source.

Parameters:SourceUrl (str) – EDXML source URL
Returns:
Return type:EDXMLEvent
AddParent(ParentHash)

Add the specified sticky hash to the list of explicit event parents.

Parameters:ParentHash (str) – Sticky hash, as hexadecimal string
Returns:
Return type:EDXMLEvent

edxml.EDXMLDefinitions module

EDXMLDefinitions

This module contains the EDXMLDefinitions class, which manages information from EDXML <definitions> sections.

class edxml.EDXMLDefinitions.EDXMLDefinitions

Bases: edxml.EDXMLBase.EDXMLBase

Class for managing information from EDXML <definitions> sections.

This class is used for managing definitions of event types, object types and sources from EDXML files. It is used for storing parsed definitions, querying definitions, and merging definitions from various EDXML files. It can be used to store <definitions> sections from multiple EDXML streams in succession, which results in the definitions from all streams being merged together. During the merge, the definitions are automatically checked for compatibility with previously stored definitions. The edxml.EDXMLBase.EDXMLError exception is raised when problems are detected.

The class also offers methods to generate EDXML <definitions> sections from the stored definitions, or generate (partial) XSD and RelaxNG schemas which can be used for validation of EDXML files.

SourceIdDefined(SourceId)

Returns boolean indicating if given Source ID exists.

Parameters:SourceId (str) – EDXML Source Identifier
Returns:bool. Source ID exists (True) or not (False)
EventTypeDefined(EventTypeName)

Returns boolean indicating if given event type is defined.

Parameters:EventTypeName (str) – Name of event type
Returns:bool. Event type exists (True) or not (False)
PropertyDefined(EventTypeName, PropertyName)

Returns boolean indicating if given property is defined.

Parameters:
  • EventTypeName (str) – Event type name
  • PropertyName (str) – Property Name
Returns:

bool. Property exists (True) or not (False)

ObjectTypeDefined(ObjectTypeName)

Returns boolean indicating if given object type is defined.

Parameters:ObjectTypeName (str) – Object type name
Returns:bool. Object type exists (True) or not (False)
RelationDefined(EventTypeName, Property1Name, Property2Name)

Returns boolean indicating if given property relation is defined.

Parameters:
  • EventTypeName (str) – Event type name
  • Property1Name (str) – Name of event type property
  • Property2Name (str) – Name of event type property
Returns:

bool. Relation exists (True) or not (False)

GetRelationPredicates()

Returns list of known relation predicates.

Returns:list. List of predicates
EventTypeIsUnique(EventTypeName)

Returns a boolean indicating if given event type is unique or not.

Parameters:EventTypeName (str) – Name of event type
Returns:bool. Event type is unique (True) or not (False)
PropertyIsUnique(EventTypeName, PropertyName)

Returns a boolean indicating if given property is unique or not.

Parameters:
  • EventTypeName (str) – Name of event type
  • PropertyName (str) – Name of event property
Returns:

bool. Property is unique (True) or not (False)

GetUniqueProperties(EventTypeName)

Returns a list of names of unique properties

Parameters:EventTypeName (str) – Name of an event type
Returns:list. List of unique properties
GetMandatoryObjectProperties(EventTypeName)

Returns a list of names of properties which must have an object

Parameters:EventTypeName (str) – Name of an event type
Returns:list. List of mandatory properties
GetSingletonObjectProperties(EventTypeName)

Returns a list of names of properties which cannot have multiple objects

Parameters:EventTypeName (str) – Name of an event type
Returns:list. List of singleton properties
PropertyDefinesEntity(EventTypeName, PropertyName)

Returns boolean indicating if property of given event type is an entity identifier.

Parameters:
  • EventTypeName (str) – Name of event type
  • PropertyName (str) – Name of event property
Returns:

bool. Is entity identifier (True) or not (False)

PropertyInRelation(EventTypeName, PropertyName)

Returns a boolean indicating if given property of specified event type is involved in any defined property relation.

Parameters:
  • EventTypeName (str) – Name of event type
  • PropertyName (str) – Name of event property
Returns:

bool. Is part of relation definition (True) or not (False)

GetSourceURLs()

Returns an ordered list of all parsed source URLs. The order as they appeared in the EDXML stream is preserved.

Returns:list. List of EDXML source URLs
GetSourceIDs()

Returns a list of all known source IDs.

Returns:list. List of EDXML source IDs
GetSourceId(Url)

Returns the ID of event source having specified URL

Parameters:Url (str) – EDXML source URL
Returns:str. EDXML source ID
GetEventTypeNames()

Returns a list of all known event type names. The order as they appeared in the EDXML stream is preserved.

Returns:list. List of event type names
GetEventTypeAttributes(EventTypeName)

Returns a dictionary containing all attributes of requested event type.

Parameters:EventTypeName (str) – Name of an event type
Returns:dict. EDXML attributes
GetEventTypeParent(EventTypeName)

Returns a dictionary containing all attributes of the parent of the requested event type. Returns an empty dictionary when the event type has no defined parent.

Parameters:EventTypeName (str) – Name of an event type
Returns:dict. EDXML attributes of parent
GetEventTypeParentMapping(EventTypeName)

Returns a dictionary containing all property names of the event type that map to a parent property. The value of each key corresponds to the name of the parent property that the child property maps to. Returns empty dictionary when event type has no defined parent.

Parameters:EventTypeName (str) – Name of an event type
Returns:dict. Child / Parent Property mapping
GetEventTypesHavingObjectType(ObjectTypeName)

Returns a list of event type names having specified object type.

Parameters:ObjectTypeName (str) – Name of an object type
Returns:list. List of event type names
GetEventTypeNamesInClass(ClassName)

Returns a list of event type names that belong to specified class.

Parameters:ClassName (str) – Name of an EDXML event type class
Returns:list. List of event type names
GetEventTypeNamesInClasses(ClassNames)

Returns a list of event type names that belong to specified list of classes.

Parameters:ClassNames (iterable) – Iterable yielding names of EDXML event type classes
Returns:list. List of event type names
GetObjectTypeAttributes(ObjectTypeName)

Returns a dictionary containing all attributes of specified object type.

Parameters:ObjectTypeName (str) – Name of an object type
Returns:dict. Dictionary of EDXML object type attributes
GetEventTypeProperties(EventTypeName)

Returns a list of all property names of given event type. The order as they appeared in the EDXML stream is preserved.

Parameters:EventTypeName (str) – Name of an event type
Returns:list. List of event type property names
GetEventTypePropertyRelations(EventTypeName)

Returns a list of all IDs of property relations in given event type. The order as they appeared in the EDXML stream is preserved.

Parameters:EventTypeName (str) – Name of an event type
Returns:list. List of property relation identifiers
GetPropertyRelationAttributes(EventTypeName, RelationId)

Returns a dictionary containing all attributes of requested relation.

The returned attributes are the attributes of the EDXML <relation> tag that corresponds to the specified relation identifier.

Parameters:
  • EventTypeName (str) – Name of an event type
  • RelationId (str) – Identifier of a property relation
Returns:

dict. Dictionary containing EDXML attributes of relation tag

GetObjectTypeNames()

Returns a list of all known object type names. The order as they appeared in the EDXML stream is preserved.

Returns:list. List of object type names
GetSourceURLProperties(Url)

Returns dictionary containing attributes of the source specified by given URL.

The returned dictionary contains the attributes of the EDXML <source> tag

Parameters:Url (str) – EDXML source URL
Returns:dict. Dictionary containing the attributes of the source tag.
GetSourceIdProperties(SourceId)

Returns dictionary containing attributes of the source specified by given Source ID.

Parameters:SourceId (str) – Source identifier
Returns:dict. Dictionary containing the attributes of the source tag.
ObjectTypeRequiresUnicode(ObjectTypeName)

Returns True when given string object type requires unicode characters, returns False otherwise.

Parameters:ObjectTypeName (str) – Name of an object type
Returns:bool. Objects require unicode encoding (True) or not (False)
GetPropertyObjectType(EventTypeName, PropertyName)

Return the name of the object type of specified event property.

Parameters:
  • EventTypeName (str) – Name of event type
  • PropertyName (str) – Name of event property
Returns:

str. Object type name

GetPropertyAttributes(EventTypeName, PropertyName)

Return dictionary of attributes of specified event property.

Parameters:
  • EventTypeName (str) – Name of event type
  • PropertyName (str) – Name of event property
Returns:

dict. Dictionary containing the EDXML attributes of the <property> tag

GetObjectTypeDataType(ObjectTypeName)

Return the data type of given object type.

Parameters:ObjectTypeName (str) – Name of an object type
Returns:str. EDXML data type
AddEventType(EventTypeName, Attributes)

Add an event type to the collection of event type definitions. If an event type definition with the same name exists, it will be checked for consistency with the existing definition.

Parameters:
  • EventTypeName (str) – Name of event type
  • Attributes (dict) – Dictionary holding the attributes of the <eventtype> tag
SetEventTypeParent(EventTypeName, Attributes)

Configure a parent of specified event type.

Parameters:
  • EventTypeName (str) – Name of event type
  • Attributes (dict) – Dictionary holding the attributes of the <parent> tag.
AddProperty(EventTypeName, PropertyName, Attributes)

Add a property to the collection of property definitions. If a property definition with the same name exists, it will be checked for consistency with the existing definition.

Parameters:
  • EventTypeName (str) – Name of an event type
  • PropertyName (str) – Name of a property
  • Attributes (dict) – Dictionary holding the attributes of the <property> tag.
AddRelation(EventTypeName, Property1Name, Property2Name, Attributes)

Add a relation to the collection of relation definitions. If a relation definition with the same properties exists, it will be checked for consistency with the existing definition.

Parameters:
  • EventTypeName (str) – Name of the event type
  • Property1Name (str) – Name of property 1
  • Property2Name (str) – Name of property 2
  • Attributes (dict) – Dictionary holding the attributes of the <relation> tag.
AddObjectType(ObjectTypeName, Attributes, WarnNotUsed=True)

Add an object type to the collection of object type definitions. If an object type definition with the same name exists, it will be checked for consistency with the existing definition.

Parameters:
  • ObjectTypeName (str) – Name of object type
  • Attributes (dict) – Dictionary holding the attributes of the <objecttype> tag.
  • WarnNotUsed (bool, optional) – Generate a warning if no property uses the object type
AddSource(SourceUrl, Attributes)

Add a source to the collection of event source definitions. If a source definition with the same URL exists, it will be checked for consistency with the existing definition.

Parameters:
  • SourceUrl (str) – URL of event source
  • Attributes (dict) – Dictionary holding the attributes of the <source> tag.
RemoveSource(SourceId)

Remove a source from the collection of event source definitions.

Parameters:SourceId (str) – EDXML source ID
CheckPropertyObjectTypes()

Checks if all object types that properties refer to are defined. Calls self.Error when a problem is detected.

CheckEventTypePropertyConsistency(EventTypeName, PropertyNames)

Check if specified list of property names is correct for the specified event type. Calls self.Error when a problem is detected.

Parameters:
  • EventTypeName (str) – Name of an event type
  • PropertyNames (list) – List of event property names
CheckEventTypeRelations(EventTypeName)

Check if the relation definitions for specified event type are correct. Calls self.Error when a problem is detected.

CheckEventTypeParents(EventTypeName)

Checks if parent definition of given event type is valid, if there is any parent definition.

Parameters:EventTypeName (str) – Name of an event type
EvaluateReporterString(EventTypeName, EventProperties, Short=False, Capitalize=True, Colorize=False)

Evaluates the short or long reporter string of an event type using specified property values, returning the result. The EventProperties argument is an associative array containing property names as keys and lists of object value strings as values.

By default, the long reporter string is evaluated, unless Short is set to True.

By default, we will try to capitalize the first letter of the resulting string, unless Capitalize is set to False.

Optionally, the output can be colorized. At this time this means that, when printed on the terminal, the objects in the evaluated string will be displayed using bold white characters.

Parameters:
  • EventTypeName (str) – Name of the event type
  • EventProperties (dict) – Property values
  • Short (bool) – Use the short (True) or long (False) reporter string
  • Capitalize (bool) – Capitalize output or not
  • Colorize (bool) – Colorize output or not
Returns:

Return type:

str

CheckReporterString(EventTypeName, String, PropertyNames, CheckCompleteness=False)

Checks if given event type reporter string makes sense. Optionally, it can also check if all given properties are present in the string.

Parameters:
  • EventTypeName (str) – Name of an event type
  • String (str) – The reporter string
  • PropertyNames (list) – List of property names belonging to the event type
  • CheckCompleteness (bool, optional) – Check if all properties are present in string
UniqueSourceIDs()

Source IDs are required to be unique only within a single EDXML file. When multiple EDXML files are parsed using the same EDXMLParser instance, it may happen that different sources have the same ID. This method changes the Source IDs of all known sources to be unique.

It returns a mapping that maps old Source ID into new Source ID.

Returns:dict. Source identifier mapping
MergeEvents(EventTypeName, EventObjectsA, EventObjectsB)

Merges the objects of an event ‘B’ with the objects of another event ‘A’. The arguments EventObjectsA and EventObjectsB should be dictionaries where the keys are property names and the values lists of object values.

The objects in EventObjectsA are updated using the objects from EventObjectsB. It returns True when EventObjectsA was modified, False otherwise.

Note that this method does NOT merge parent hashes, only the property objects.

Parameters:
  • EventTypeName (str) – Name of event type of the events
  • EventObjectsA (dict) – Objects of event A
  • EventObjectsB (dict) – Objects of event B
ComputeStickyHash(EventTypeName, EventObjects, EventContent)

Computes a sticky hash from given event. The EventObjects argument should be a list containing dictionaries representing the objects. The dictionaries should contain the property name stored under the ‘property’ key and the value stored under the ‘value’ key.

Parameters:
  • EventTypeName (str) – The name of the event type
  • EventObjects (list) – List of event objects
  • EventContent (str) – The content of the event

Note

The supplied object values must be normalized using edxml.EDXMLBase.EDXMLBase.NormalizeObject().

Returns:str. A hexadecimal string representation of the hash.
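
The EventObjects structure described above differs from the property dictionaries used elsewhere in the SDK. The sketch below converts one form into the other. It does not compute a hash, properties_to_objects is a hypothetical helper, and the sorting is only there to make the result deterministic in this example. The values are assumed to be normalized already.

```python
def properties_to_objects(properties):
    """Flatten {property: [values]} into a list of property/value dictionaries."""
    objects = []
    for name in sorted(properties):
        for value in properties[name]:
            objects.append({'property': name, 'value': value})
    return objects


event_objects = properties_to_objects({'port': ['80', '443'], 'address': ['192.0.2.1']})
```
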
ComputeStickyHashV3(EventTypeName, SourceUrl, EventObjects, EventContent)

Computes a sticky hash from given event, using the hashing algorithm from EDXML specification version 3.x. The EventObjects argument should be a list containing dictionaries representing the objects. The dictionaries should contain the property name stored under the ‘property’ key and the value stored under the ‘value’ key.

Parameters:
  • EventTypeName (str) – The name of the event type
  • SourceUrl (str) – EDXML source URL
  • EventObjects (list) – List of event objects
  • EventContent (str) – The content of the event

Note

The supplied object values must be normalized using edxml.EDXMLBase.EDXMLBase.NormalizeObject().

Returns:str. A hexadecimal string representation of the hash.
GenerateEventTypeXML(EventTypeName, XMLGenerator)

Generates an EDXML fragment which defines specified event type. Can be useful for constructing new EDXML files based on existing event type definitions.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • EventTypeName (str) – Name of the event type
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
GenerateEventPropertyXML(EventTypeName, PropertyName, XMLGenerator)

Generates an EDXML fragment which defines specified event type property. Can be useful for constructing new EDXML files based on existing event type definitions.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • EventTypeName (str) – Name of the event type
  • PropertyName (str) – Name of the property
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
GeneratePropertyRelationsXML(EventTypeName, XMLGenerator)

Generates an EDXML fragment which defines all property relations of specified event type. Can be useful for constructing new EDXML files based on existing event type definitions.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • EventTypeName (str) – Name of the event type
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
GeneratePropertyRelationXML(EventTypeName, RelationId, XMLGenerator)

Generates an EDXML fragment which defines specified property relation of specified event type. Can be useful for constructing new EDXML files based on existing event type definitions.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • EventTypeName (str) – Name of the event type
  • RelationId (str) – Identifier of a property relation
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
GenerateObjectTypeXML(ObjectTypeName, XMLGenerator)

Generates an EDXML fragment which defines specified object type. Can be useful for constructing new EDXML files based on existing object type definitions.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • ObjectTypeName (str) – Name of the object type
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
GenerateEventSourceXML(SourceUrl, XMLGenerator)

Generates an EDXML fragment which defines specified event source. Can be useful for constructing new EDXML files based on existing event source definitions.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • SourceUrl (str) – EDXML source URL
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
GenerateXMLDefinitions(XMLGenerator, IncludeSources=True)

Generates a full EDXML <definitions> section, containing all known event types, object types and optionally sources.

The XMLGenerator argument may be either a SAX XMLGenerator instance or an ElementTree SimpleXMLWriter instance.

Parameters:
  • XMLGenerator (XMLGenerator,XMLWriter) – XMLGenerator / XMLWriter instance
  • IncludeSources (bool, optional) – Boolean, include source definitions yes or no
OpenXSD()

Start generating an XSD schema from stored definitions. Always call this before constructing a (partial) XSD schema.

CloseXSD()

Finalize generated XSD and return it as a string.

GenerateEventTypeXSD(EventTypeName)

Generates an XSD fragment related to the event type definition of specified event type. Can be useful for generating modular XSD schemas or constructing full EDXML validation schemas.

Make sure to call OpenXSD() first.

Parameters:EventTypeName (str) – Name of an event type
GenerateObjectTypeXSD(ObjectTypeName)

Generates an XSD fragment related to the object type definition of specified object type. Can be useful for generating modular XSD schemas or constructing full EDXML validation schemas.

Make sure to call OpenXSD() first.

Parameters:ObjectTypeName (str) – Name of an object type
GenerateFullXSD()

Generates a full XSD schema for EDXML files that contain all known definitions of event types, object types and sources.

Make sure to call OpenXSD() first.

OpenRelaxNG()

Start generating a RelaxNG schema from stored definitions. Always call this before constructing a (partial) RelaxNG schema.

CloseRelaxNG()

Finalize RelaxNG schema and return it as a string.

Returns:str. The RelaxNG schema
GenerateEventTypeRelaxNG(EventTypeName)

Generates a RelaxNG fragment related to the event type definition of specified event type. Can be useful for generating modular RelaxNG schemas or constructing full EDXML validation schemas.

Parameters:EventTypeName (str) – Name of an event type

Make sure to call OpenRelaxNG() first.

GenerateObjectTypeRelaxNG(ObjectTypeName)

Generates a RelaxNG fragment related to the object type definition of specified object type. Can be useful for generating modular RelaxNG schemas or constructing full EDXML validation schemas.

Make sure to call OpenRelaxNG() first.

Parameters:ObjectTypeName (str) – Name of an object type
GenerateEventRelaxNG(EventTypeName)

Generates a RelaxNG fragment describing events of the specified event type. Can be useful for generating modular RelaxNG schemas or constructing full EDXML validation schemas.

Make sure to call OpenRelaxNG() first.

Parameters:EventTypeName (str) – Name of an event type
GenerateGenericSourcesRelaxNG()

Generates a RelaxNG fragment representing an event source. Can be useful for generating modular RelaxNG schemas or constructing full EDXML validation schemas.

Make sure to call OpenRelaxNG() first.

GenerateFullRelaxNG(EventRefs=None, EventTypeRefs=None, ObjectTypeRefs=None)

Generates a full RelaxNG schema, containing all known definitions of event types, object types and sources. You can optionally provide dictionaries which map event type names or object type names to URIs. In this case, the resulting schema will refer to these URIs instead of generating the schema patterns in place. This might be useful if you have a central storage for event type definitions or object type definitions.

Make sure to call OpenRelaxNG() first.

Parameters:
  • EventRefs (dict, optional) – Dictionary containing URI of event schema for every event type name
  • EventTypeRefs (dict, optional) – Dictionary containing URI of event type schema for every event type name
  • ObjectTypeRefs (dict, optional) – Dictionary containing URI of object type schema for every object type name

edxml.EDXMLFilter module

EDXMLFilter

This module can be used to write EDXML filtering scripts, which can edit EDXML streams. All filtering classes are based on edxml.EDXMLParser, so you can conveniently use the Definitions attribute of EDXMLParser to query details about all defined event types, object types, sources, and so on.

class edxml.EDXMLFilter.EDXMLStreamFilter(upstream, SkipEvents=False, Output=<open file '<stdout>', mode 'w'>)

Bases: edxml.EDXMLParser.EDXMLParser

Base class for implementing EDXML filters

This class inherits from EDXMLParser and causes the EDXML data to be passed through to STDOUT.

You can pass any file-like object using the Output parameter, which will be used to send the filtered data stream to. It defaults to sys.stdout (standard output).

Parameters:
  • upstream – XML source (SaxParser instance in most cases)
  • SkipEvents (bool, optional) – Set to True to parse only the definitions section
  • Output (file, optional) – An optional file-like object, defaults to sys.stdout
SetOutputEnabled(YesOrNo)

This method implements a global switch to turn XML pass through on or off. You can use it to allow certain parts of EDXML files to pass through to STDOUT while other parts are filtered out.

Parameters:YesOrNo (bool) – Output enabled (True) or disabled (False)
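
The idea of a pass-through switch can be sketched with the standard library alone. The handler below is not the SDK class: SwitchablePassThrough, set_output_enabled and the <secret> element are hypothetical names chosen for illustration, and the example is written for Python 3. It forwards SAX events to an XMLGenerator only while output is enabled, and it does not handle nested <secret> elements.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class SwitchablePassThrough(xml.sax.ContentHandler):
    """Hypothetical handler that forwards SAX events only while enabled."""

    def __init__(self, output):
        xml.sax.ContentHandler.__init__(self)
        self._writer = XMLGenerator(output, 'utf-8')
        self._enabled = True

    def set_output_enabled(self, yes_or_no):
        self._enabled = yes_or_no

    def startElement(self, name, attrs):
        if name == 'secret':
            # Stop forwarding before the unwanted element is written.
            self.set_output_enabled(False)
        if self._enabled:
            self._writer.startElement(name, attrs)

    def endElement(self, name):
        if self._enabled:
            self._writer.endElement(name)
        if name == 'secret':
            # Resume forwarding after the unwanted element has closed.
            self.set_output_enabled(True)

    def characters(self, content):
        if self._enabled:
            self._writer.characters(content)


output = io.StringIO()
xml.sax.parseString(b'<root><keep>a</keep><secret>b</secret></root>',
                    SwitchablePassThrough(output))
filtered = output.getvalue()
```
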
class edxml.EDXMLFilter.EDXMLValidatingStreamFilter(upstream, SkipEvents=False, Output=<open file '<stdout>', mode 'w'>)

Bases: edxml.EDXMLParser.EDXMLValidatingParser

Base class for implementing EDXML filters

This class is identical to the EDXMLStreamFilter class, except that it fully validates each event that is output by the filter.

You can pass any file-like object using the Output parameter, which will be used to send the filtered data stream to. It defaults to sys.stdout (standard output).

Parameters:
  • upstream – XML source (SaxParser instance in most cases)
  • SkipEvents (bool, optional) – Set to True to parse only the definitions section
  • Output (file, optional) – An optional file-like object, defaults to sys.stdout
SetOutputEnabled(YesOrNo)

This method implements a global switch to turn XML pass through on or off. You can use it to allow certain parts of EDXML files to pass through to STDOUT while other parts are filtered out.

Note that the output of the filter is validated, so be careful not to break the EDXML data while filtering it.

Parameters:YesOrNo (bool) – Output enabled (True) or disabled (False)
class edxml.EDXMLFilter.EDXMLObjectEditor(upstream, Output=<open file '<stdout>', mode 'w'>)

Bases: edxml.EDXMLFilter.EDXMLValidatingStreamFilter

This class implements an EDXML filter which can be used to edit objects in an EDXML stream. It offers the EditObject() method which can be overridden to implement your own object editing EDXML processor.

You can pass any file-like object using the Output parameter, which will be used to send the filtered data stream to. It defaults to sys.stdout (standard output).

Parameters:
  • upstream – XML source (SaxParser instance in most cases)
  • Output (optional) – A file-like object, defaults to sys.stdout
InsertObject(PropertyName, Value)

Insert a new object into the EDXML stream

This method can be called from implementations of EditObject() to add objects to the current event.

Parameters:
  • PropertyName (str) – Property of the new object
  • Value (str) – Value of the new object
EditObject(SourceId, EventTypeName, ObjectTypeName, attrs)

This method can be overridden to process single objects.

Implementations should return the new object attributes by means of an xml.sax.xmlreader.AttributesImpl object.

Parameters:
  • SourceId (str) – EDXML Source Identifier
  • EventTypeName (str) – Name of the event type of current event
  • ObjectTypeName (str) – Object type of the object
  • attrs (AttributesImpl) – XML attributes of the <object> tag
Returns:

AttributesImpl. Updated XML attributes of the <object> tag
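
The body of an EditObject() override might look like the sketch below. It is written as a plain function so that it runs without the SDK. The attribute names 'property' and 'value', the object type name and the lower-casing rule are assumptions made for illustration only.

```python
from xml.sax.xmlreader import AttributesImpl


def edit_object(source_id, event_type_name, object_type_name, attrs):
    """Return a copy of the <object> attributes with the value lower-cased.

    Takes the documented EditObject() arguments. The 'value' attribute
    name and the 'email-address' object type are assumptions.
    """
    updated = dict(attrs.items())
    if object_type_name == 'email-address' and 'value' in updated:
        updated['value'] = updated['value'].lower()
    # A new AttributesImpl is returned; the input object is left untouched.
    return AttributesImpl(updated)


original = AttributesImpl({'property': 'sender', 'value': 'Alice@Example.COM'})
edited = edit_object('source-1', 'example.mail', 'email-address', original)
print(edited.getValue('value'))
```

Returning a fresh AttributesImpl rather than mutating the argument keeps the original attributes available to the caller.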

class edxml.EDXMLFilter.EDXMLEventEditor(upstream, Output=<open file '<stdout>', mode 'w'>)

Bases: edxml.EDXMLFilter.EDXMLValidatingStreamFilter

This class implements an EDXML filter which can be used to edit events in an EDXML stream. It offers the EditEvent() method, which can be overridden to implement your own event editing EDXML processor.

The filtered data stream is written to the file-like object passed in the Output parameter. It defaults to sys.stdout (standard output).

Parameters:
  • upstream – XML source (SaxParser instance in most cases)
  • Output (optional) – A file-like object, defaults to sys.stdout
DeleteEvent()

Delete an event while editing

Call this method from EditEvent() to delete the event instead of just editing it.

EditEvent(SourceId, EventTypeName, EventObjects, EventContent, EventAttributes)

Modifies an event

This method can be overridden to process single events.

The EventObjects parameter is a list of dictionaries. Each dictionary represents one object, containing a ‘property’ key and a ‘value’ key.

Parameters:
  • SourceId (str) – EDXML source identifier
  • EventTypeName (str) – Name of the event type
  • EventObjects (list) – List of event objects
  • EventContent (str) – Event content string
  • EventAttributes (AttributesImpl) – Sax AttributesImpl object containing <event> tag attributes
Returns:

tuple. Modified copies of the EventObjects, EventContent and EventAttributes parameters, in that order.
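
A minimal sketch of what an EditEvent() override could do is shown below, again as a plain function that runs without the SDK. The filtering rule (dropping objects with empty values, stripping the content) and all sample values are hypothetical.

```python
from xml.sax.xmlreader import AttributesImpl


def edit_event(source_id, event_type_name, event_objects, event_content,
               event_attributes):
    """Drop objects with empty values and strip whitespace from the content.

    Takes the documented EditEvent() arguments and returns modified copies
    in the documented order, leaving the inputs untouched.
    """
    new_objects = [dict(obj) for obj in event_objects if obj['value'] != '']
    new_content = event_content.strip()
    new_attributes = AttributesImpl(dict(event_attributes.items()))
    return new_objects, new_content, new_attributes


objects = [
    {'property': 'user', 'value': 'alice'},
    {'property': 'host', 'value': ''},
]
result = edit_event('source-1', 'example.login', objects, '  raw line \n',
                    AttributesImpl({}))
print(result[0])
print(repr(result[1]))
```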

edxml.EDXMLParser module

EDXMLParser

This module is used for parsing out information about eventtype, objecttype and source definitions from EDXML streams.

The classes contain a Definitions property which is an instance of the EDXMLDefinitions class. All parsed information from the EDXML header is stored there, and you can use it to query information about event types, object types, and so on.

Classes in this module:

  • EDXMLParser
  • EDXMLValidatingParser

class edxml.EDXMLParser.EDXMLParser(upstream, SkipEvents=False)

Bases: edxml.EDXMLBase.EDXMLBase, xml.sax.saxutils.XMLFilterBase

The EDXMLParser class can be used as a content handler for Sax, and has several methods that can be overridden to implement custom EDXML processing scripts. It can optionally skip reading the event data itself if you are only interested in obtaining the definitions. In that case, it will abort XML processing by raising the edxml.EDXMLBase.EDXMLProcessingInterrupted exception, which you can catch and handle.

Parameters:
  • upstream – XML source (SaxParser instance in most cases)
  • SkipEvents (bool, optional) – Set to True to parse only the definitions section
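
The "abort after the definitions" behaviour can be illustrated with the standard library alone. The sketch below does not use the SDK: ProcessingInterrupted is a stand-in for edxml.EDXMLBase.EDXMLProcessingInterrupted, the handler class is hypothetical, and the sample document is a simplified stand-in rather than valid EDXML.

```python
import xml.sax


class ProcessingInterrupted(Exception):
    """Stand-in for edxml.EDXMLBase.EDXMLProcessingInterrupted."""


class DefinitionsOnlyHandler(xml.sax.ContentHandler):
    """Collects event type names, then aborts once </definitions> is seen."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.event_type_names = []
        self.events_seen = 0

    def startElement(self, name, attrs):
        if name == 'eventtype':
            self.event_type_names.append(attrs.get('name'))
        elif name == 'event':
            self.events_seen += 1

    def endElement(self, name):
        if name == 'definitions':
            # Everything of interest has been read: stop parsing here.
            raise ProcessingInterrupted()


DOCUMENT = (
    b'<events><definitions><eventtypes>'
    b'<eventtype name="example.login"/>'
    b'</eventtypes></definitions>'
    b'<eventgroups><eventgroup><event/></eventgroup></eventgroups></events>'
)

handler = DefinitionsOnlyHandler()
try:
    xml.sax.parseString(DOCUMENT, handler)
except ProcessingInterrupted:
    # Expected: the definitions are complete, the events were never read.
    pass

print(handler.event_type_names)
```

With the SDK, the same try/except shape would wrap the parse call of a parser created with SkipEvents=True.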
Definitions

edxml.EDXMLDefinitions.EDXMLDefinitions instance

DefinitionsXMLGenerator = None

EDXMLDefinitions instance

EndOfStream()

This method can be overridden to finish processing the event stream.

The parser will call this method when the end of the EDXML stream has been reached.

ProcessEvent(EventTypeName, SourceId, EventObjects, EventContent, Parents)

This method can be overridden to process events. The EventObjects parameter contains a list of dictionaries, one for each object. Each dictionary has two keys. The ‘property’ key contains the name of the property. The ‘value’ key contains the value.

Parameters:
  • EventTypeName (str) – The name of the event type
  • SourceId (str) – Event source identifier
  • EventObjects (list) – List of objects
  • EventContent (str) – String containing event content
  • Parents (list) – List of hashes of explicit parent events, as hexadecimal strings
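
Inside a ProcessEvent() override it is often convenient to regroup the EventObjects list by property. The helper below is a hypothetical sketch, not part of the SDK. Its result has the same shape as the PropertyObjects dictionary described for EDXMLWriter.AddEvent(), which holds a list of object values for every property.

```python
from collections import defaultdict


def group_objects_by_property(event_objects):
    """Turn [{'property': p, 'value': v}, ...] into {p: [v, ...]}."""
    grouped = defaultdict(list)
    for obj in event_objects:
        grouped[obj['property']].append(obj['value'])
    return dict(grouped)


event_objects = [
    {'property': 'user', 'value': 'alice'},
    {'property': 'address', 'value': '192.0.2.1'},
    {'property': 'address', 'value': '192.0.2.2'},
]
print(group_objects_by_property(event_objects))
```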
ProcessObject(EventTypeName, ObjectProperty, ObjectValue)

This method can be overridden to process objects.

The method will be called by the parser after reading an object element.

Parameters:
  • EventTypeName (str) – The name of the event type
  • ObjectProperty (str) – The name of the object property
  • ObjectValue (str) – String containing object value
DefinitionsLoaded()

This method can be overridden to perform some action as soon as the definitions are read and parsed.

The parser will call it as soon as the <definitions> element has been fully read and parsed. From that moment on, all event type and object type definitions can be accessed through the Definitions attribute of the parser instance.

GetEventCount(EventTypeName=None)

Returns the number of events parsed.

When an event type is passed, only the number of events of this type is returned.

Parameters:EventTypeName (str, optional) – Name of an event type
Returns:int. The number of events parsed.
GetWarningCount()

Returns the number of warnings issued

Returns:int. The number of warnings issued.
GetErrorCount()

Returns the number of errors issued

Returns:int. The number of errors issued.
GetDefinitionsElementAsString()

Returns string representation of the <definitions> element

Should not be called until the definitions tag has been fully fed to the parser.

Returns:str. The XML string
class edxml.EDXMLParser.EDXMLValidatingParser(upstream, SkipEvents=False, ValidateObjects=True)

Bases: edxml.EDXMLParser.EDXMLParser

This class extends the functionality of EDXMLParser with thorough checking of the EDXML data. You can use the EDXMLValidatingParser class to parse EDXML data that you don’t trust. The class will raise edxml.EDXMLBase.EDXMLError when it finds problems in the data. Validation is implemented by overriding EDXMLParser.DefinitionsLoaded(), EDXMLParser.ProcessObject() and EDXMLParser.ProcessEvent().

Like edxml.EDXMLParser.EDXMLParser, it can optionally skip reading the event data itself if you are only interested in obtaining and validating the definitions. In that case, it will abort XML processing by raising the edxml.EDXMLBase.EDXMLProcessingInterrupted exception, which you can catch and handle.

Parameters:
  • upstream – XML source (SaxParser instance in most cases)
  • SkipEvents (bool, optional) – Set to True to parse only the definitions section
  • ValidateObjects (bool, optional) – Set to False to skip automatic object value validation
Definitions

edxml.EDXMLDefinitions.EDXMLDefinitions instance

edxml.EDXMLWriter module

EDXMLWriter

This module contains the EDXMLWriter class, which is used to generate EDXML streams.

class edxml.EDXMLWriter.EDXMLWriter(Output, Validate=True, ValidateObjects=True)

Bases: edxml.EDXMLBase.EDXMLBase

Class for generating EDXML streams

The Output parameter is a file-like object to which the XML data is written. It can be pretty much anything, as long as it has a write() method.

The optional Validate parameter controls if the generated EDXML stream should be autovalidated or not. Automatic validation is enabled by default. This parameter applies to all aspects of EDXML validation, except for object value validation, which is covered by the ValidateObjects parameter.

Enabling object value validation always results in full EDXML validation, regardless of the value of the Validate parameter.

Parameters:
  • Output (file) – File-like output object
  • Validate (bool, optional) – Enable output validation (True) or not (False)
  • ValidateObjects (bool, optional) – Enable object validation (True) or not (False)
AddXmlDefinitionsElement(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert a full <definitions> element.

Parameters:XmlString (str) – String containing the <definitions> element
AddXmlEventTypeElement(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert a full <eventtype> element.

Parameters:XmlString (str) – String containing the <eventtype> element
AddXmlObjectTypeTag(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert an <objecttype> tag.

Parameters:XmlString (str) – String containing the <objecttype> tag
AddXmlPropertyTag(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert a <property> tag.

Parameters:XmlString (str) – String containing the <property> tag
AddXmlRelationTag(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert a <relation> tag.

Parameters:XmlString (str) – String containing the <relation> tag
AddXmlSourceTag(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert a <source> tag.

Parameters:XmlString (str) – String containing the <source> tag
AddXmlEventTag(XmlString)

Apart from programmatically adding to an EDXML stream, it is also possible to insert plain XML into the stream. Both methods result in full automatic validation of the EDXML stream.

Use this method to insert a full <event> element.

Parameters:XmlString (str) – String containing the <event> element
OpenDefinitions()

Opens the <definitions> element

OpenEventDefinitions()

Opens the <eventtypes> element

OpenEventDefinition(Name, Description, ClassList, ReporterShort, ReporterLong, DisplayName='/')

Opens an event type definition.

Parameters:
  • Name (str) – Name of the eventtype
  • Description (str) – Description of the eventtype
  • ClassList (str) – String containing a comma separated list of class names
  • ReporterShort (str) – Short reporter string. Please refer to the specification for details.
  • ReporterLong (str) – Long reporter string. Please refer to the specification for details.
  • DisplayName (str,optional) – EDXML display-name attribute
AddEventTypeParent(EventTypeName, PropertyMapping, ParentDescription, SiblingsDescription)

Adds a parent to an event definition.

Parameters:
  • EventTypeName (str) – Name of the parent eventtype
  • PropertyMapping (str) – Value of the EDXML propertymap attribute
  • ParentDescription (str) – Value of the EDXML parent-description attribute
  • SiblingsDescription (str) – Value of the EDXML siblings-description attribute
OpenEventDefinitionProperties()

Opens a <properties> element for defining eventtype properties.

AddEventProperty(Name, ObjectTypeName, Description, DefinesEntity=False, EntityConfidence=0, Unique=False, Merge='drop', Similar=None)

Adds a property to an event definition.

Parameters:
  • Name (str) – Name of the property
  • ObjectTypeName (str) – Name of the object type
  • Description (str) – Description of the property
  • DefinesEntity (bool,optional) – Property is entity identifier or not
  • EntityConfidence (float,optional) – Floating point confidence
  • Unique (bool,optional) – Property is unique or not
  • Merge (str,optional) – Merge strategy (only for unique properties)
  • Similar (str,optional) – EDXML similar attribute value
CloseEventDefinitionProperties()

Closes a previously opened <properties> section

OpenEventDefinitionRelations()

Opens a <relations> section for defining property relations.

AddRelation(PropertyName1, PropertyName2, Type, Description, Confidence, Directed=True)

Adds a property relation to an event definition.

Parameters:
  • PropertyName1 (str) – Name of first property
  • PropertyName2 (str) – Name of second property
  • Type (str) – EDXML Relation type attribute
  • Description (str) – Relation description
  • Confidence (float) – Floating point confidence value
  • Directed (bool,optional) – Boolean indicating if relation is directed (True) or not (False)
CloseEventDefinitionRelations()

Closes a previously opened <relations> section

CloseEventDefinition()

Closes a previously opened event definition

CloseEventDefinitions()

Closes a previously opened <eventtypes> section

OpenObjectTypes()

Opens an <objecttypes> section for defining object types.

AddObjectType(Name, Description, ObjectDataType, FuzzyMatching='none', DisplayName='/', Compress=False, ENP=0, Regexp='[\\s\\S]*')

Adds an object type definition.

Parameters:
  • Name (str) – Name of object type
  • Description (str) – Description of object type
  • ObjectDataType (str) – EDXML Data type
  • FuzzyMatching (str,optional) – EDXML fuzzy-matching attribute
  • DisplayName (str,optional) – Display name
  • Compress (bool,optional) – Use data compression (True) or not (False)
  • ENP (int,optional) – EDXML enp attribute
  • Regexp (str,optional) – EDXML regexp attribute
CloseObjectTypes()

Closes a previously opened <objecttypes> section

OpenSourceDefinitions()

Opens a <sources> section for defining event sources.

AddSource(SourceId, URL, DateAcquired, Description)

Adds a source definition.

Parameters:
  • SourceId (str) – EDXML source ID
  • URL (str) – Source URL
  • DateAcquired (str) – Acquisition date (yyyymmdd)
  • Description (str) – Description of the source
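
The DateAcquired string can be produced from a date object with strftime(). The helper name below is hypothetical.

```python
import datetime


def format_date_acquired(date):
    """Format a date as the yyyymmdd string expected for DateAcquired."""
    return date.strftime('%Y%m%d')


print(format_date_acquired(datetime.date(2015, 3, 7)))  # prints 20150307
```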
CloseSourceDefinitions()

Closes a previously opened <sources> section

CloseDefinitions()

Closes the <definitions> section

OpenEventGroups()

Opens the <eventgroups> section, containing all eventgroups

OpenEventGroup(EventTypeName, SourceId)

Opens an event group.

Parameters:
  • EventTypeName (str) – Name of the eventtype
  • SourceId (str) – Source Id
CloseEventGroup()

Closes a previously opened event group

AddEvent(PropertyObjects, Content='', ParentHashes=[], IgnoreInvalidObjects=False)

Alternative method for adding an event

This method expects a dictionary containing a list of object values for every property.

The optional ParentHashes parameter may contain a list of sticky hashes of explicit parent events, in hexadecimal string representation.

If IgnoreInvalidObjects is set to True, any errors thrown by the validator as a result of invalid object values will be ignored, and the object will not be included in the event.

Parameters:
  • PropertyObjects (dict) – Object dictionary
  • Content (str,optional) – Event content
  • ParentHashes (list,optional) – List of explicit parent events
  • IgnoreInvalidObjects (bool,optional) – Option to ignore invalid object values
OpenEvent(ParentHashes=[])

Opens an event.

The optional ParentHashes parameter may contain a list of sticky hashes of explicit parent events, in hexadecimal string representation.

Parameters:ParentHashes (list,optional) – List of explicit parent events
AddObject(PropertyName, Value, IgnoreInvalid=False)

Adds an object to the previously opened event.

If IgnoreInvalid is set to True, any errors thrown by the validator as a result of invalid object values will be ignored, and the object will not be included in the event.

Parameters:
  • PropertyName (str) – Name of object property
  • Value – Object value, can be any object that can be converted to unicode.
  • IgnoreInvalid (bool,optional) – Generate a warning instead of an error for invalid values
AddContent(ContentString)

Adds plain text content to the previously opened event.

Parameters:ContentString (str) – Event content
AddTranslation(Language, Interpreter, TranslationString)

Adds translated content to the previously opened event.

Parameters:
  • Language (str) – ISO 639-1 language code
  • Interpreter (str) – Name of interpreter
  • TranslationString (str) – The translation
CloseEvent()

Closes a previously opened event

CloseEventGroups()

Closes a previously opened <eventgroups> section
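
The sketch below strings the Open/Add/Close methods together, assuming they are meant to be called in the order in which this reference lists them. It runs against CallRecorder, a hypothetical stand-in that only records method names, so it neither needs the SDK nor produces XML. All names and argument values are placeholders that a real, validating writer may well reject. Parents, relations and translations are left out for brevity.

```python
class CallRecorder(object):
    """Hypothetical stand-in for EDXMLWriter that only records method names."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. writer methods.
        def record(*args, **kwargs):
            self.calls.append(name)
        return record


def write_minimal_stream(writer):
    """Call the documented writer methods in their assumed order."""
    writer.OpenDefinitions()
    writer.OpenEventDefinitions()
    writer.OpenEventDefinition('example.login', 'Login event', 'example',
                               'short reporter', 'long reporter')
    writer.OpenEventDefinitionProperties()
    writer.AddEventProperty('user', 'example.username', 'user logging in')
    writer.CloseEventDefinitionProperties()
    writer.CloseEventDefinition()
    writer.CloseEventDefinitions()
    writer.OpenObjectTypes()
    writer.AddObjectType('example.username', 'a user name', 'string:0:cs:u')
    writer.CloseObjectTypes()
    writer.OpenSourceDefinitions()
    writer.AddSource('source-1', '/example/', '20150307', 'example source')
    writer.CloseSourceDefinitions()
    writer.CloseDefinitions()
    writer.OpenEventGroups()
    writer.OpenEventGroup('example.login', 'source-1')
    writer.AddEvent({'user': ['alice']})
    writer.CloseEventGroup()
    writer.CloseEventGroups()


recorder = CallRecorder()
write_minimal_stream(recorder)
print(recorder.calls)
```

Passing an actual EDXMLWriter instance instead of the recorder would be the next step. Whether the placeholder values pass validation is not something this sketch can show.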

edxml.SimpleEDXMLWriter module

class edxml.SimpleEDXMLWriter.SimpleEDXMLWriter(Output, Validate=True, ValidateObjects=True)

Bases: object

High level EDXML stream writer

This class offers a simplified interface to the EDXMLWriter class. Apart from a simplified interface, it implements some additional features like buffering, post-processing, automatic merging of output events and latency control.

__init__(Output, Validate=True, ValidateObjects=True)

Create a new SimpleEDXMLWriter, outputting an EDXML stream to specified output.

By default, the output will be fully validated. Optionally, validating the event objects can be disabled, or output validation can be completely disabled by setting Validate to False. This may be used to boost performance in case you know that the data will be validated at the receiving end, or in case you know that your generator is perfect. :)

The Output parameter is a file-like object to which the XML data is written. It can be pretty much anything, as long as it has a write() method.

Parameters:
  • Output (file) – a file-like object
  • Validate (bool, optional) – Validate the output (True) or not (False)
  • ValidateObjects (bool, optional) – Validate event objects (True) or not (False)
Returns:

Return type:

SimpleEDXMLWriter

RegisterEventPostProcessor(EventTypeName, Callback)

Register a post-processor for events of the specified type. Whenever an event is submitted through the AddEvent() method, the supplied callback method will be invoked before the event is output. The callback must have the same call signature as the AddEvent() method. The two optional arguments Type and Source will always be specified when the callback is invoked. The callback should not return anything.

Apart from generating events, callbacks can also modify the event that is about to be output, by editing its call arguments.

Parameters:
  • EventTypeName (str) – Name of the event type
  • Callback (callable) – The callback
Returns:

The SimpleEDXMLWriter instance

Return type:

SimpleEDXMLWriter

IgnoreInvalidObjects()

Instructs the EDXML writer to ignore invalid object values. After calling this method, any event value that fails to validate will be silently dropped.

Note

Dropping object values may lead to invalid events.

Note

This has no effect when object validation is disabled.

Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
IgnoreInvalidEvents(Warn=False)

Instructs the EDXML writer to ignore invalid events. After calling this method, any event that fails to validate will be dropped. If Warn is set to True, a detailed warning will be printed, allowing the source and cause of the problem to be determined.

Note

This also implies that invalid objects will be ignored.

Note

This has no effect when event validation is disabled.

Parameters:Warn (bool, optional) – Print warnings or not
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
AutoMerge(EventTypeName)

Enable auto-merging for events of specified event type. Auto-merging implies that colliding output events will be merged before outputting them. This may be useful to reduce the event output rate when generating large numbers of colliding events.

Parameters:EventTypeName (str) – The name of the event type
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
AddEventType(Type)

Add specified event type to the output stream.

Parameters:Type (EventType) – EventType instance
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
AddObjectType(Type)

Add specified object type to the output stream.

Parameters:Type (ObjectType) – ObjectType instance
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
AddEventSource(Source)

Add specified event source to the output stream.

Parameters:Source (EventSource) – EventSource instance
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
SetEventType(EventTypeName)

Set the default output event type. If no explicit event type is used in calls to AddEvent(), the default event type will be used.

Parameters:EventTypeName (str) – The event type name
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
SetEventSource(SourceId)

Set the default event source for the output events. If no explicit source is specified in calls to AddEvent(), the default source will be used.

Parameters:SourceId (str) – The event source identifier
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
AddEvent(Event)

Add the specified event to the output stream. If the event type or event source are not specified, the default type and source that have been set using SetEventType() and SetEventSource() will be used.

Parameters:Event (EDXMLEvent) – An EDXMLEvent instance
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
GenerateEvent(Properties, Content='', Parents=None, Type=None, Source=None)

Generate a new event and write to the output stream. If the event type or event source are not specified, the default type and source that have been set using SetEventType() and SetEventSource() will be used.

The Properties dictionary must have keys containing the property names. The values of the dictionary must be object values or lists of object values. Object values can be anything that can be cast to a unicode object.

Parameters:
  • Properties (dict[str]) – The event properties
  • Content (str, optional) – Event content string
  • Parents (list[str], optional) – List of sticky hashes, as hex strings
  • Type (str, optional) – Event type name
  • Source (str, optional) – Source identifier
Returns:

The SimpleEDXMLWriter instance

Return type:

SimpleEDXMLWriter
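
The accepted shapes of the Properties dictionary can be illustrated with a small normalization helper. This is a hypothetical sketch of the idea, not the SDK's internal code: single values are wrapped in a list and every value is cast to a unicode string.

```python
def normalize_properties(properties):
    """Return {name: [text, ...]} with every value wrapped in a list."""
    normalized = {}
    for name, value in properties.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        # u'%s' casts any object value to a unicode string.
        normalized[name] = [u'%s' % (item,) for item in values]
    return normalized


print(normalize_properties({'user': 'alice', 'port': [22, 80]}))
```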

SetBufferSize(EventCount)

Sets the buffer size for writing events to the output. The default buffer size is 1024 events.

Parameters:EventCount (int) – Maximum number of events
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
SetOutputLatency(Latency)

Sets the output latency, in seconds. A positive latency forces the writer to flush its buffers at least once per latency interval. The default latency is zero, which means that output will be silent for as long as it takes to fill the input buffer.

Parameters:Latency (float) – Maximum output latency (seconds)
Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter
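
How a buffer size and an output latency can work together is sketched below. EventBuffer is a hypothetical class written for illustration. It is not the SDK's implementation and it only checks the latency when an event is added.

```python
import time


class EventBuffer(object):
    """Hypothetical buffer: flushes when full or when the latency expires."""

    def __init__(self, flush, max_events=1024, latency=0.0, clock=time.time):
        self._flush = flush            # callable receiving a list of events
        self._max_events = max_events
        self._latency = latency        # zero disables time-based flushing
        self._clock = clock
        self._events = []
        self._last_flush = clock()

    def add(self, event):
        self._events.append(event)
        full = len(self._events) >= self._max_events
        expired = (self._latency > 0 and
                   self._clock() - self._last_flush >= self._latency)
        if full or expired:
            self.flush()

    def flush(self):
        if self._events:
            self._flush(self._events)
            self._events = []
        self._last_flush = self._clock()


batches = []
event_buffer = EventBuffer(batches.append, max_events=3)
for number in range(7):
    event_buffer.add(number)
event_buffer.flush()  # final flush, comparable to calling Close()
print(batches)  # prints [[0, 1, 2], [3, 4, 5], [6]]
```

The clock is injectable so that the latency rule can be exercised without sleeping.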
Close()

Finalizes the output stream generation process. This method must be called to yield a complete, valid output stream.

Returns:The SimpleEDXMLWriter instance
Return type:SimpleEDXMLWriter

edxml.ontology module

This sub-package contains classes that represent EDXML ontology elements, like event types, object types, event sources, and so on.

class edxml.ontology.ObjectType(Name, DisplayName, Description=None, DataType='string:0:cs:u', Enp=0, Compress=False, FuzzyMatching='none', Regexp='[\s\S]*')

Bases: object

Class representing an EDXML object type

classmethod Create(Name, DisplayNameSingular=None, DisplayNamePlural=None, Description=None, DataType='string:0:cs:u')

Creates and returns a new ObjectType instance. When no display names are specified, display names will be created from the object type name. If only a singular form is specified, the plural form will be auto-generated by appending an ‘s’.

Parameters:
  • Name (str) – object type name
  • DisplayNameSingular (str) – display name (singular form)
  • DisplayNamePlural (str) – display name (plural form)
  • Description (str) – short description of the object type
  • DataType (str) – a valid EDXML data type
Returns:

The ObjectType instance

Return type:

ObjectType

GetName()

Returns the name of the object type.

Returns:The object type name
Return type:str
GetDisplayNameSingular()

Returns the display name of the object type, in singular form.

Returns:
Return type:str
GetDisplayNamePlural()

Returns the display name of the object type, in plural form.

Returns:
Return type:str
GetDescription()

Returns the description of the object type.

Returns:
Return type:str
GetDataType()

Returns the data type of the object type.

Returns:
Return type:str
GetEntityNamingPriority()

Returns entity naming priority of the object type.

Returns:
Return type:int
IsCompressible()

Returns True if compression is advised for the object type, returns False otherwise.

Returns:
Return type:bool
GetFuzzyMatching()

Returns the EDXML fuzzy-matching attribute for the object type.

Returns:
Return type:str
GetRegexp()

Returns the regular expression that object values must match.

Returns:
Return type:str
SetDescription(Description)

Sets the object type description

Parameters:Description (str) – Description
Returns:The ObjectType instance
Return type:ObjectType
SetDataType(DataType)

Configure the data type.

Parameters:DataType (DataType) – DataType instance
Returns:The ObjectType instance
Return type:ObjectType
SetDisplayName(Singular, Plural=None)

Configure the display name. If the plural form is omitted, it will be auto-generated by appending an ‘s’ to the singular form.

Parameters:
  • Singular (str) – display name (singular form)
  • Plural (str) – display name (plural form)
Returns:

The ObjectType instance

Return type:

ObjectType

SetEntityNamingPriority(Priority)

Configure the entity naming priority of the object type.

Parameters:Priority (int) – The EDXML priority attribute
Returns:The ObjectType instance
Return type:ObjectType
SetRegexp(Pattern)

Configure a regular expression that object values must match.

Parameters:Pattern (str) – Regular expression
Returns:The ObjectType instance
Return type:ObjectType
FuzzyMatchHead(Length)

Configure fuzzy matching on the head of the string (only for string data types).

Parameters:Length (int) – Number of characters to match
Returns:The ObjectType instance
Return type:ObjectType
FuzzyMatchTail(Length)

Configure fuzzy matching on the tail of the string (only for string data types).

Parameters:Length (int) – Number of characters to match
Returns:The ObjectType instance
Return type:ObjectType
FuzzyMatchSubstring(Pattern)

Configure fuzzy matching on a substring (only for string data types).

Parameters:Pattern (str) – Regular expression
Returns:The ObjectType instance
Return type:ObjectType
FuzzyMatchPhonetic()

Configure fuzzy matching on the sound of the string (phonetic fingerprinting).

Returns:The ObjectType instance
Return type:ObjectType
Compress()

Enable compression for the object type.

Returns:The ObjectType instance
Return type:ObjectType
Write(Writer)

Write the object type definition into the provided EDXMLWriter instance.

Parameters:Writer (EDXMLWriter) – An EDXMLWriter instance
Returns:The ObjectType instance
Return type:ObjectType
class edxml.ontology.DataType(data_type)

Bases: object

Class representing an EDXML data type

classmethod Timestamp()

Create a timestamp DataType instance.

Returns:
Return type:DataType
classmethod Boolean()

Create a boolean value DataType instance.

Returns:
Return type:DataType
classmethod TinyInt(Signed=True)

Create an 8-bit tinyint DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
classmethod SmallInt(Signed=True)

Create a 16-bit smallint DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
classmethod MediumInt(Signed=True)

Create a 24-bit mediumint DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
classmethod Int(Signed=True)

Create a 32-bit int DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
classmethod BigInt(Signed=True)

Create a 64-bit bigint DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
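
The bit widths of the integer data types translate into value ranges as computed below, assuming the usual two's complement interpretation for signed numbers. The table and helper are hypothetical. Whether the SDK validates against exactly these bounds is not stated in this reference.

```python
INTEGER_BITS = {
    'tinyint': 8,
    'smallint': 16,
    'mediumint': 24,
    'int': 32,
    'bigint': 64,
}


def integer_range(bits, signed=True):
    """Return (minimum, maximum) for an integer of the given width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for name in ('tinyint', 'smallint', 'mediumint', 'int', 'bigint'):
    print(name, integer_range(INTEGER_BITS[name]),
          integer_range(INTEGER_BITS[name], signed=False))
```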
classmethod Float(Signed=True)

Create a 32-bit float DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
classmethod Double(Signed=True)

Create a 64-bit double DataType instance.

Parameters:Signed (bool) – Create signed or unsigned number
Returns:
Return type:DataType
classmethod Decimal(TotalDigits, FractionalDigits, Signed=True)

Create a decimal DataType instance.

Parameters:
  • TotalDigits (int) – Total number of digits
  • FractionalDigits (int) – Number of digits after the decimal point
  • Signed (bool) – Create signed or unsigned number
Returns:

Return type:

DataType

classmethod String(Length=0, CaseSensitive=True, RequireUnicode=True, ReverseStorage=False)

Create a string DataType instance.

Parameters:
  • Length (int) – Max number of characters (zero = unlimited)
  • CaseSensitive (bool) – Treat strings as case sensitive (True) or case insensitive (False)
  • RequireUnicode (bool) – String may contain UTF-8 characters
  • ReverseStorage (bool) – Hint storing the string in reverse character order
Returns:

Return type:

DataType

classmethod Enum(*Choices)

Create an enumeration DataType instance.

Parameters:*Choices (str) – Possible string values
Returns:
Return type:DataType
classmethod Hexadecimal(Length, Separator=None, GroupSize=None)

Create a hexadecimal number DataType instance.

Parameters:
  • Length (int) – Number of hex digits
  • Separator (str) – Separator character
  • GroupSize (int) – Number of hex digits per group
Returns:

Return type:

DataType

classmethod GeoPoint()

Create a geographical location DataType instance.

Returns:
Return type:DataType

Create a hashlink DataType instance.

Returns:
Return type:DataType
classmethod Ipv4()

Create an IPv4 DataType instance

Returns:
Return type:DataType
Get()

Returns the EDXML data-type attribute.

Returns:
Return type:str
GetFamily()

Returns the data type family.

Returns:
Return type:str
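
Judging from the default value 'string:0:cs:u' used elsewhere in this reference, a data-type attribute is a colon-separated string. The sketch below splits such a string, assuming that the family is its first component. That assumption and the helper name are not taken from the SDK.

```python
def split_data_type(data_type):
    """Split an EDXML data-type string into (family, parameters)."""
    parts = data_type.split(':')
    return parts[0], parts[1:]


family, parameters = split_data_type('string:0:cs:u')
print(family)
print(parameters)
```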
class edxml.ontology.EventProperty(Name, ObjectTypeName, Description=None, DefinesEntity=False, EntityConfidence=0, Unique=False, Merge='drop', Similar='')

Bases: object

Class representing an EDXML event property

MERGE_MATCH = 'match'

Merge strategy ‘match’

MERGE_DROP = 'drop'

Merge strategy ‘drop’

MERGE_ADD = 'add'

Merge strategy ‘add’

MERGE_REPLACE = 'replace'

Merge strategy ‘replace’

MERGE_INC = 'increment'

Merge strategy ‘increment’

MERGE_SUM = 'sum'

Merge strategy ‘sum’

MERGE_MULTIPLY = 'multiply'

Merge strategy ‘multiply’

MERGE_MIN = 'min'

Merge strategy ‘min’

MERGE_MAX = 'max'

Merge strategy ‘max’

classmethod Create(Name, ObjectTypeName, Description=None)

Create a new event property.

Note

The description should be really short, indicating which role the object has in the event type.

Parameters:
  • Name (str) – Property name
  • ObjectTypeName (str) – Name of the object type
  • Description (str) – Property description
Returns:

The EventProperty instance

Return type:

EventProperty

GetName()

Returns the property name.

Returns:
Return type:str
GetDescription()

Returns the property description.

Returns:
Return type:str
GetObjectTypeName()

Returns the name of the associated object type.

Returns:
Return type:str
GetMergeStrategy()

Returns the merge strategy.

Returns:
Return type:str
GetEntityConfidence()

Returns the entity identification confidence.

Returns:
Return type:float
GetSimilarHint()

Get the EDXML ‘similar’ attribute.

Returns:
Return type:str
SetMergeStrategy(MergeStrategy)

Set the merge strategy of the property. This should be one of the MERGE_* attributes of this class.

Parameters:MergeStrategy (str) – The merge strategy
Returns:The EventProperty instance
Return type:EventProperty
SetDescription(Description)

Set the description of the property. This should be really short, indicating which role the object has in the event type.

Parameters:Description (str) – The property description
Returns:The EventProperty instance
Return type:EventProperty
Unique()

Mark property as a unique property, which also sets the merge strategy to ‘match’.

Returns:The EventProperty instance
Return type:EventProperty
IsUnique()

Returns True if property is unique, returns False otherwise

Returns:
Return type:bool
Entity(Confidence)

Marks the property as an entity identifying property, with specified confidence.

Parameters:Confidence (float) – entity identification confidence [0.0, 1.0]
Returns:The EventProperty instance
Return type:EventProperty
IsEntity()

Returns True if property is an entity identifying property, returns False otherwise.

Returns:
Return type:bool
HintSimilar(Similarity)

Set the EDXML ‘similar’ attribute.

Parameters:Similarity (str) – similar attribute string
Returns:The EventProperty instance
Return type:EventProperty
MergeAdd()

Set merge strategy to ‘add’.

Returns:The EventProperty instance
Return type:EventProperty
MergeReplace()

Set merge strategy to ‘replace’.

Returns:The EventProperty instance
Return type:EventProperty
MergeDrop()

Set merge strategy to ‘drop’, which is the default merge strategy.

Returns:The EventProperty instance
Return type:EventProperty
MergeMin()

Set merge strategy to ‘min’.

Returns:The EventProperty instance
Return type:EventProperty
MergeMax()

Set merge strategy to ‘max’.

Returns:The EventProperty instance
Return type:EventProperty
MergeIncrement()

Set merge strategy to ‘increment’.

Returns:The EventProperty instance
Return type:EventProperty
MergeSum()

Set merge strategy to ‘sum’.

Returns:The EventProperty instance
Return type:EventProperty
MergeMultiply()

Set merge strategy to ‘multiply’.

Returns:The EventProperty instance
Return type:EventProperty
Write(Writer)

Writes the property into the provided EDXMLWriter instance.

Parameters:Writer (EDXMLWriter) – EDXMLWriter instance
Returns:The EventProperty instance
Return type:EventProperty
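
Because every setter above returns the EventProperty instance, calls can be chained. The sketch below illustrates that pattern with a small stand-in class, so that it runs without the SDK installed. PropertySketch, its attribute names and the object type names are hypothetical. Only the method names and the 'drop' and 'match' strategies are taken from the text above.

```python
class PropertySketch(object):
    """Stand-in that mimics the chaining style of EventProperty.

    This is NOT the SDK class; it only models the behaviour described above.
    """

    MERGE_ADD = 'add'

    def __init__(self, name, object_type_name, description=None):
        self.name = name
        self.object_type_name = object_type_name
        self.description = description
        self.merge = 'drop'            # documented default merge strategy
        self.unique = False
        self.entity_confidence = None  # assumed representation of "no entity"

    def Unique(self):
        # Marking a property as unique also sets the strategy to 'match'.
        self.unique = True
        self.merge = 'match'
        return self

    def Entity(self, confidence):
        self.entity_confidence = float(confidence)
        return self

    def MergeAdd(self):
        self.merge = self.MERGE_ADD
        return self


# Hypothetical property and object type names, for illustration only.
user = PropertySketch('user', 'computing.user.name', 'account').Unique().Entity(0.9)
tags = PropertySketch('tag', 'generic.tag').MergeAdd()
```

With the real class, the same chain would presumably start from EventProperty.Create(...) instead of the stand-in constructor.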
class edxml.ontology.PropertyRelation(Source, Dest, Description, TypeClass, TypePredicate, Confidence=1.0, Directed=True)

Bases: object

Class representing a relation between two EDXML properties

classmethod Create(Source, Dest, Description, TypeClass, TypePredicate, Confidence=1.0, Directed=True)

Create a new property relation

Parameters:
  • Source (str) – Name of source property
  • Dest (str) – Name of destination property
  • Description (str) – Relation description, with property placeholders
  • TypeClass (str) – Relation type class (‘inter’, ‘intra’ or ‘other’)
  • TypePredicate (str) – free form predicate
  • Confidence (float) – Relation confidence [0.0,1.0]
  • Directed (bool) – Directed relation True / False
Returns:

The PropertyRelation instance

Return type:

PropertyRelation

GetSource()

Returns the name of the source property.

Returns:
Return type:str
GetDest()

Returns the name of the destination property.

Returns:
Return type:str
GetDescription()

Returns the relation description.

Returns:
Return type:str
GetType()

Returns the relation type.

Returns:
Return type:str
GetTypeClass()

Returns the class part of the relation type.

Returns:
Return type:str
GetTypePredicate()

Returns the predicate part of the relation type.

Returns:
Return type:str
GetConfidence()

Returns the relation confidence.

Returns:
Return type:float
IsDirected()

Returns True when the relation is directed, returns False otherwise.

Returns:
Return type:bool
SetConfidence(Confidence)

Configure the relation confidence

Parameters:Confidence (float) – Relation confidence [0.0,1.0]
Returns:The PropertyRelation instance
Return type:PropertyRelation
Directed()

Marks the property relation as directed

Returns:The PropertyRelation instance
Return type:PropertyRelation
Undirected()

Marks the property relation as undirected

Returns:The PropertyRelation instance
Return type:PropertyRelation
Write(Writer)

Writes the property relation into the provided EDXMLWriter instance

Parameters:Writer (EDXMLWriter) – EDXMLWriter instance
Returns:The PropertyRelation instance
Return type:PropertyRelation
class edxml.ontology.EventType(Name, DisplayName=None, Description=None, ClassList='', ReporterShort='no description available', ReporterLong='no description available', Parent=None)

Bases: object

Class representing an EDXML event type

classmethod Create(Name, DisplayNameSingular=None, DisplayNamePlural=None, Description=None)

Creates and returns a new EventType instance. When no display names are specified, display names will be created from the event type name. If only a singular form is specified, the plural form will be auto-generated by appending an ‘s’.

Parameters:
  • Name (str) – Event type name
  • DisplayNameSingular (str) – Display name (singular form)
  • DisplayNamePlural (str) – Display name (plural form)
  • Description (str) – Event type description
Returns:

The EventType instance

Return type:

EventType

GetName()

Returns the event type name

Returns:
Return type:str
GetDisplayNameSingular()

Returns the event type display name, in singular form.

Returns:
Return type:str
GetDisplayNamePlural()

Returns the event type display name, in plural form.

Returns:
Return type:str
GetClasses()

Returns the list of classes that this event type belongs to.

Returns:
Return type:list[str]
GetProperty(PropertyName)

Returns the property instance of the event type property having specified name.

Returns:The EventProperty instance
Return type:EventProperty
GetProperties()

Returns a dictionary containing all properties of the event type. The keys in the dictionary are the property names, the values are the EventProperty instances.

Returns:Properties
Return type:dict[str,EventProperty]
HasClass(ClassName)

Returns True if the specified class is in the list of classes that this event type belongs to, returns False otherwise.

Parameters:ClassName (str) – The class name
Returns:
Return type:bool
GetReporterShort()

Returns the short reporter string.

Returns:
Return type:str
GetReporterLong()

Returns the long reporter string.

Returns:
Return type:str
AddProperty(Property)

Add specified property

Parameters:Property (EventProperty) – EventProperty instance
Returns:The EventType instance
Return type:EventType
AddRelation(Relation)

Add specified property relation

Parameters:Relation (PropertyRelation) – Property relation
Returns:The EventType instance
Return type:EventType
SetDescription(Description)

Sets the event type description

Parameters:Description (str) – Description
Returns:The EventType instance
Return type:EventType
SetParent(Parent)

Set the parent event type

Parameters:Parent (EventTypeParent) – Parent event type
Returns:The EventType instance
Return type:EventType
AddClass(ClassName)

Adds the specified event type class

Parameters:ClassName (str) – The class name
Returns:The EventType instance
Return type:EventType
SetName(EventTypeName)

Sets the name of the event type.

Parameters:EventTypeName (str) – Event type name
Returns:The EventType instance
Return type:EventType
SetDisplayName(Singular, Plural=None)

Configure the display name. If the plural form is omitted, it will be auto-generated by appending an ‘s’ to the singular form.

Parameters:
  • Singular (str) – Singular display name
  • Plural (str) – Plural display name
Returns:

The EventType instance

Return type:

EventType

SetReporterShort(Reporter)

Set the short reporter string

Parameters:Reporter (str) – The short reporter string
Returns:The EventType instance
Return type:EventType
SetReporterLong(Reporter)

Set the long reporter string

Parameters:Reporter (str) – The long reporter string
Returns:The EventType instance
Return type:EventType
Write(Writer)

Writes the event type into the provided EDXMLWriter instance

Parameters:Writer (EDXMLWriter) – EDXMLWriter instance
Returns:The EventType instance
Return type:EventType
class edxml.ontology.EventTypeParent(ParentEventTypeName, PropertyMap, ParentDescription=None, SiblingsDescription=None)

Bases: object

Class representing an EDXML event type parent

classmethod Create(ParentEventTypeName, PropertyMap, ParentDescription=None, SiblingsDescription=None)

Creates a new event type parent. The PropertyMap argument is a dictionary mapping property names of the child event type to property names of the parent event type.

If no ParentDescription is specified, it will be set to ‘belonging to’. If no SiblingsDescription is specified, it will be set to ‘sharing’.

Note

All unique properties of the parent event type must appear in the property map.

Note

The parent event type must be defined in the same EDXML stream as the child.

Parameters:
  • ParentEventTypeName (str) – Name of the parent event type
  • PropertyMap (dict[str, str]) – Property map
  • ParentDescription (str, Optional) – The EDXML parent-description attribute
  • SiblingsDescription (str, Optional) – The EDXML siblings-description attribute
Returns:

The EventTypeParent instance

Return type:

EventTypeParent
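
The first note above can be checked before creating the parent. The following is a minimal sketch, assuming the unique properties of the parent event type are known as a plain list. The helper name and the property names are hypothetical and not part of the SDK.

```python
def unmapped_unique_parent_properties(property_map, parent_unique_properties):
    """Return the unique parent properties that no child property maps to.

    property_map maps child property names to parent property names, so the
    parent properties are the dictionary values.
    """
    mapped = set(property_map.values())
    return sorted(p for p in parent_unique_properties if p not in mapped)


# Hypothetical child-to-parent property map.
property_map = {'message-id': 'id', 'mailbox': 'mailbox-name'}
missing = unmapped_unique_parent_properties(property_map, ['id', 'owner'])
```

An empty result means that every unique property of the parent appears in the map.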

SetParentDescription(Description)

Sets the EDXML parent-description attribute

Parameters:Description (str) – The EDXML parent-description attribute
Returns:The EventTypeParent instance
Return type:EventTypeParent
SetSiblingsDescription(Description)

Sets the EDXML siblings-description attribute

Parameters:Description (str) – The EDXML siblings-description attribute
Returns:The EventTypeParent instance
Return type:EventTypeParent
GetEventType()

Returns the name of the parent event type.

Returns:
Return type:str
GetPropertyMap()

Returns the property map as a dictionary mapping property names of the child event type to property names of the parent.

Returns:
Return type:dict[str,str]
GetParentDescription()

Returns the EDXML ‘parent-description’ attribute.

Returns:
Return type:str
GetSiblingsDescription()

Returns the EDXML ‘siblings-description’ attribute.

Returns:
Return type:str
Write(Writer)

Writes the parent into the provided EDXMLWriter instance.

Parameters:Writer (EDXMLWriter) – An EDXMLWriter instance
Returns:The EventTypeParent instance
Return type:EventTypeParent
class edxml.ontology.EventSource(Id, Url, Description=None, AcquisitionDate=None)

Bases: object

Class representing an EDXML event source

classmethod Create(Url, Description=None, AcquisitionDate=None)

Creates a new event source definition. If no acquisition date is specified, it will be assumed that the acquisition date is today.

Note

Choose your source URLs wisely. The source URLs are used in sticky hash computations, so changing the URL may have quite a few consequences if the hash is referred to anywhere. Also, pay attention to the URL in the context of URLs generated by other EDXML data sources, to obtain a consistent, well structured source URL tree.

Parameters:
  • Url (str) – The source URL
  • Description (str) – Description of the source
  • AcquisitionDate (str) – Acquisition date in format yyyymmdd
Returns:

The EventSource instance

Return type:

EventSource
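
The yyyymmdd format and the default of today can be sketched with the standard library. The helper name below is hypothetical; the SDK computes the default internally.

```python
import datetime


def acquisition_date_string(date=None):
    """Format a date as yyyymmdd, defaulting to today's date."""
    if date is None:
        date = datetime.date.today()
    return date.strftime('%Y%m%d')


fixed = acquisition_date_string(datetime.date(2017, 3, 9))
```

The resulting string is the kind of value expected by the AcquisitionDate argument.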

GetId()

Returns the source Id

Returns:
Return type:str
GetUrl()

Returns the source URL

Returns:
Return type:str
GetAcquisitionDateString()

Returns the acquisition date

Returns:The date in yyyymmdd format
Return type:str
SetDescription(Description)

Sets the source description

Parameters:Description (str) – Description
Returns:The EventSource instance
Return type:EventSource
Write(Writer)

Writes the event source into the provided EDXMLWriter instance

Parameters:Writer (EDXMLWriter) – EDXMLWriter instance
Returns:The EventSource instance
Return type:EventSource

edxml.transcode module

This sub-package contains several classes to ease development of transcoders that convert various types of input data (like JSON records) into EDXML output streams.

edxml.transcode.json module

This sub-package implements a transcoder to convert JSON records into EDXML output streams. The various classes in this package can be extended to implement transcoders for specific types of JSON records and route JSON records to the correct transcoder.

class edxml.transcode.json.JsonTranscoder

Bases: edxml.EDXMLBase.EDXMLBase

This is a base class that can be extended to implement transcoders for various JSON record types. The class features a number of constants containing information about the types of EDXML events that it will produce, which JSON record types result in which EDXML event types, and so on. Extensions can override these constants, allowing the GenerateEventTypes() method to use the information in the constants to generate basic EDXML event type definitions.

Except for the TYPES constant, all constants are optional. They just provide a means for developers to replace code with constants, improving the readability of the transcoder.

TYPES = []

The TYPES attribute must contain a list of EDXML event type names of the event types that will be generated by the transcoder.

Note

Overriding and populating this attribute is mandatory.

TYPE_MAP = {}

The TYPE_MAP attribute is a dictionary mapping JSON record type names to the corresponding EDXML event type names.

Note

When no EDXML event type name is specified for a particular JSON record type, it is up to the transcoder to set the event type.

Note

The fallback transcoder must set the None key to the name of the EDXML fallback event type.

TYPE_DESCRIPTIONS = {}

The TYPE_DESCRIPTIONS attribute is a dictionary mapping EDXML event type names to event type descriptions.

TYPE_DISPLAY_NAMES = {}

The TYPE_DISPLAY_NAMES attribute is a dictionary mapping EDXML event type names to event type display names. Each display name is a list, containing the singular form, optionally followed by the plural form, like this:

{'event-type-name': ['event', 'events']}

The plural form may be omitted. In that case, the plural form will be assumed to be the singular form with an additional ‘s’ appended.
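
The pluralization rule can be sketched as follows. The helper name and the event type names are hypothetical; only the rule of appending an 's' is taken from the text above.

```python
def expand_display_name(names):
    """Return (singular, plural) from a one- or two-element list."""
    singular = names[0]
    # Fall back to appending 's' when no plural form is given.
    plural = names[1] if len(names) > 1 else singular + 's'
    return singular, plural


TYPE_DISPLAY_NAMES = {
    'login-attempt': ['login attempt'],
    'directory-entry': ['directory entry', 'directory entries'],
}

expanded = {
    name: expand_display_name(forms)
    for name, forms in TYPE_DISPLAY_NAMES.items()
}
```

Irregular plurals, such as 'directory entries', need the explicit two-element form.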

ATTRIBUTE_MAP = {}

The ATTRIBUTE_MAP attribute is a dictionary mapping JSON attributes to EDXML event properties. The map is used to automatically populate the properties of the EDXMLEvent instances produced by the Generate method of the JsonTranscoder class. The keys may contain dots, indicating a subfield or positions within an array, like so:

{'fieldname.0.subfieldname': 'property-name'}

Note that the event structure will not be validated until the event is yielded by the Generate() method. This creates the possibility to add nonexistent properties to the attribute map and remove them in the Generate method, which may be convenient for composing properties from multiple JSON values, or for splitting the auto-generated event into multiple output events.
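
The dotted key notation can be illustrated with a small lookup routine. This is a sketch of the idea, not the SDK's implementation: the function names are hypothetical, and collecting values in lists is an assumption based on EDXML properties being able to hold multiple objects.

```python
def lookup(record, dotted_key):
    """Follow a dotted key through dicts, lists and object attributes.

    Returns None as soon as one of the steps cannot be resolved.
    """
    current = record
    for part in dotted_key.split('.'):
        if isinstance(current, dict):
            if part not in current:
                return None
            current = current[part]
        elif isinstance(current, (list, tuple)):
            # A numeric part is a position within the array.
            if not part.isdigit() or int(part) >= len(current):
                return None
            current = current[int(part)]
        elif hasattr(current, part):
            current = getattr(current, part)
        else:
            return None
    return current


def populate(record, attribute_map):
    """Build a property dictionary from a record and an attribute map."""
    properties = {}
    for dotted_key, property_name in attribute_map.items():
        value = lookup(record, dotted_key)
        if value is not None:
            properties.setdefault(property_name, []).append(value)
    return properties


record = {
    'user': {'name': 'alice'},
    'addresses': [{'ip': '192.0.2.1'}, {'ip': '192.0.2.2'}],
}
ATTRIBUTE_MAP = {
    'user.name': 'user',
    'addresses.0.ip': 'client-ip',
    'missing.field': 'unused',
}
properties = populate(record, ATTRIBUTE_MAP)
```

Keys that cannot be resolved are skipped, so 'unused' does not appear in the result.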

TYPE_REPORTERS_SHORT = {}

The TYPE_REPORTERS_SHORT attribute is a dictionary mapping EDXML event type names to short EDXML reporter strings.

TYPE_REPORTERS_LONG = {}

The TYPE_REPORTERS_LONG attribute is a dictionary mapping EDXML event type names to long EDXML reporter strings.

TYPE_PROPERTIES = {}

The TYPE_PROPERTIES attribute is a dictionary mapping EDXML event type names to their properties. For each key (the event type name) there should be a dictionary containing the desired properties, with property names as keys and object type names as values. Properties will automatically be generated when this constant is populated.

Example:

{'event_type_name': {'property_name': 'object_type_name'}}
PROPERTY_DESCRIPTIONS = {}

The PROPERTY_DESCRIPTIONS attribute is a dictionary mapping EDXML property names to descriptions. It will be used to automatically set the descriptions of any automatically generated properties.

PROPERTY_SIMILARITY = {}

The PROPERTY_SIMILARITY attribute is a dictionary mapping EDXML property names to EDXML ‘similar’ attributes. It will be used to automatically set the similar attributes of any automatically generated properties.

PROPERTY_MERGE_STRATEGIES = {}

The PROPERTY_MERGE_STRATEGIES attribute is a dictionary mapping EDXML property names to EDXML ‘merge’ attributes, which indicate the merge strategy of the property. It will be used to set the merge attribute for any automatically generated properties.

When no merge strategy is given, automatically generated properties will have the default strategy, which is ‘match’ for unique properties and ‘drop’ for all other properties.

For convenience, the EventProperty class defines some class attributes representing the available merge strategies. Example:

{'property_name': EventProperty.MERGE_ADD}
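
The default rule described above can be sketched as a small function. The function name and the property names are hypothetical. The string 'add' stands for the value of EventProperty.MERGE_ADD.

```python
def resolve_merge_strategy(name, unique_properties, overrides):
    """Pick the merge strategy for one automatically generated property."""
    # Defaults: 'match' for unique properties, 'drop' for all others.
    default = 'match' if name in unique_properties else 'drop'
    # An explicitly configured strategy takes the place of the default.
    return overrides.get(name, default)


unique_properties = ['session-id']
PROPERTY_MERGE_STRATEGIES = {'tag': 'add'}

resolved = {
    name: resolve_merge_strategy(name, unique_properties, PROPERTY_MERGE_STRATEGIES)
    for name in ['session-id', 'tag', 'user']
}
```
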
TYPE_UNIQUE_PROPERTIES = {}

The TYPE_UNIQUE_PROPERTIES attribute is a dictionary mapping EDXML event type names to lists of unique properties. The lists will be used to mark the listed properties as unique when properties are generated automatically. Example:

{'event_type_name': ['unique-property-a', 'unique-property-b']}
PARENTS_CHILDREN = []

The PARENTS_CHILDREN attribute is a list containing parent-child event type relations. Each relation is a list containing the event type name of the parent, the EDXML ‘parent-description’ attribute and the event type name of the child event type, in that order. It will be used in conjunction with the CHILDREN_SIBLINGS and PARENT_MAPPINGS attributes to configure event type parents for any automatically generated event types. Example:

PARENTS_CHILDREN = [
  ['parent-event-type-name', 'containing', 'child-event-type-name']
]

Note

Please refer to the EDXML specification for details about how to choose a proper value for the parent-description attribute.

CHILDREN_SIBLINGS = []

The CHILDREN_SIBLINGS attribute is a list containing child-siblings event type relations. Each relation is a list containing the event type name of the child, the EDXML ‘siblings-description’ attribute and the event type name of the parent event type, in that order. Example:

CHILDREN_SIBLINGS = [
  ['child-event-type-name', 'contained in', 'parent-event-type-name']
]

Note

Please refer to the EDXML specification for details about how to choose a proper value for the siblings-description attribute.

PARENT_MAPPINGS = {}

The PARENT_MAPPINGS attribute is a dictionary mapping EDXML event type names to parent property mappings. Each mapping is a dictionary containing properties of the event type as keys and properties of the parent event type as values. Example:

PARENT_MAPPINGS = {
  'child-event-type-name': {
    'child-property-name-a': 'parent-property-name-a',
    'child-property-name-b': 'parent-property-name-b',
  }
}

Note

Please refer to the EDXML specification for details about how parent property mappings work.
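
The PARENTS_CHILDREN, CHILDREN_SIBLINGS and PARENT_MAPPINGS constants describe different parts of the same parent definition. The sketch below shows one way they could be combined per child event type. The function name is hypothetical and the SDK's own logic may differ. The dictionary keys are chosen to match the argument names of EventTypeParent.Create.

```python
def build_parent_arguments(parents_children, children_siblings, parent_mappings):
    """Collect, per child event type, the arguments describing its parent."""
    arguments = {}
    for parent, parent_description, child in parents_children:
        arguments[child] = {
            'ParentEventTypeName': parent,
            'PropertyMap': dict(parent_mappings.get(child, {})),
            'ParentDescription': parent_description,
            'SiblingsDescription': None,  # filled in below when available
        }
    for child, siblings_description, parent in children_siblings:
        entry = arguments.get(child)
        if entry is not None and entry['ParentEventTypeName'] == parent:
            entry['SiblingsDescription'] = siblings_description
    return arguments


PARENTS_CHILDREN = [
    ['parent-event-type-name', 'containing', 'child-event-type-name'],
]
CHILDREN_SIBLINGS = [
    ['child-event-type-name', 'contained in', 'parent-event-type-name'],
]
PARENT_MAPPINGS = {
    'child-event-type-name': {'child-property-name-a': 'parent-property-name-a'},
}

arguments = build_parent_arguments(PARENTS_CHILDREN, CHILDREN_SIBLINGS, PARENT_MAPPINGS)
```

Each resulting dictionary could then be passed as keyword arguments to EventTypeParent.Create.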

Generate(Json, RecordTypeName, **kwargs)

Generates one or more EDXML events from the given JSON record, populating them with properties using the ATTRIBUTE_MAP class property.

The JSON record can be passed either as a dictionary, an object, a dictionary containing objects or an object containing dictionaries. Dictionaries are allowed to contain lists or other dictionaries. For objects, the ATTRIBUTE_MAP will be used to access its attributes. These attributes may in turn be dictionaries, lists or other objects. Using dotted notation in ATTRIBUTE_MAP, you can extract pretty much everything from anything.

This method can be overridden to create a generic event generator, populating the output events with generic properties that may or may not be useful to the record specific transcoders. The record specific transcoders can refine the events that are generated upstream by adding, changing or removing properties, editing the event content, and so on.

Parameters:
  • Json (dict, object) – Decoded JSON data
  • RecordTypeName (str) – The JSON record type
  • **kwargs – Arbitrary keyword arguments
Yields:

EDXMLEvent

static IsPostProcessor()

Returns True if the events generated by Generate() should be passed to the PostProcess() method. This may be useful to generate derivative events based on the events generated by Generate().

By default, this method returns False. It can be overridden to return True instead.

Returns:
Return type:bool
PostProcess(Event)

Generates zero or more EDXML events from the given EDXML input event, possibly altering the passed event in the process. The passed EDXMLEvent instances are taken from the output of the Generate() method.

Parameters:Event (EDXMLEvent) – Input event
Yields:EDXMLEvent
static GetAutoMergeEventTypes()

Returns a list of names of event types that should be automatically merged while generating them. This may be useful to reduce the event output rate when generating large numbers of colliding events.

Returns:
Return type:list[str]
static GenerateObjectTypes()

This method may be used to generate generic EDXML object types that are used by all transcoders.

Yields:ObjectType
GenerateEventTypes()

This method generates event types using the attributes of the various transcoders. This yields a set of preconfigured event types that may optionally be tuned by each of the transcoders. Yields tuples containing pairs of event type names and event type instances.

Yields:tuple[str, EventType]
class edxml.transcode.json.JsonTranscoderMediator(Output)

Bases: edxml.EDXMLBase.EDXMLBase

This class is a mediator between a source of JSON records and a set of JsonTranscoder implementations that can transcode the JSON records into EDXML events.

Sources can instantiate the mediator and feed it JSON records, while transcoders can register themselves with the mediator in order to transcode the types of JSON record that they support.

TYPE_FIELD = None

This constant must be set to the name of the field in the root of the JSON record that contains the JSON record type, allowing the mediator to route JSON records to the correct transcoder.

If the constant is set to None, all JSON records will be routed to the fallback transcoder. If no fallback transcoder is available, the record will not be processed.

Note

The fallback transcoder is a transcoder that registered itself as a transcoder for the record type named ‘JSON_OF_UNKNOWN_TYPE’, which is a reserved name.

classmethod Register(RecordTypeName, Transcoder)

Register a transcoder for processing records of the specified type. The same transcoder can be registered for multiple record types. The Transcoder argument must be the JsonTranscoder class or an extension of it. Pass the class itself, not an instance of it.

Note

Any transcoder that registers itself as a transcoder for the record type named ‘JSON_OF_UNKNOWN_TYPE’ is used as the fallback transcoder. The fallback transcoder is used to transcode any record that has a record type for which no transcoder has been registered.

Parameters:
  • RecordTypeName (str) – Name of the JSON record type
  • Transcoder (JsonTranscoder) – JsonTranscoder class
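
The routing rules described for TYPE_FIELD and Register() can be sketched with a plain dictionary as the registry. This only models the documented behaviour; the function name, the registry layout and the transcoder classes are hypothetical stand-ins.

```python
FALLBACK = 'JSON_OF_UNKNOWN_TYPE'  # reserved record type name


def select_transcoder(record, type_field, registry):
    """Return the transcoder class for a record, or None if there is none."""
    # Without a type field, every record goes to the fallback transcoder.
    record_type = record.get(type_field) if type_field is not None else None
    if record_type in registry:
        return registry[record_type]
    return registry.get(FALLBACK)


class LoginTranscoder(object):
    """Stand-in for a JsonTranscoder extension."""


class FallbackTranscoder(object):
    """Stand-in for a fallback JsonTranscoder extension."""


# Classes are registered, not instances.
registry = {'login': LoginTranscoder, FALLBACK: FallbackTranscoder}

known = select_transcoder({'type': 'login'}, 'type', registry)
unknown = select_transcoder({'type': 'logout'}, 'type', registry)
```

Here, known is LoginTranscoder and unknown is FallbackTranscoder. Without a fallback entry in the registry, records of an unregistered type yield None and would not be processed.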
Debug()

Enable debugging mode, which prints informative messages about JSON transcoding issues, disables event buffering and stops on errors.

Returns:
Return type:JsonTranscoderMediator
DisableEventValidation()

Instructs the EDXML writer not to validate its output. This may be used to boost performance in case you know that the data will be validated at the receiving end, or in case you know that your generator is perfect. :)

Returns:
Return type:JsonTranscoderMediator
DisableObjectValidation()

Instructs the EDXML writer not to validate the objects in the events. The global data stream structure and event structure will still be validated. This may be used to boost performance in case you know that the data will be validated at the receiving end, or in case you have other means to guarantee that your objects are valid.

Returns:
Return type:JsonTranscoderMediator
IgnoreInvalidObjects()

Instructs the EDXML writer to ignore invalid object values. After calling this method, any object value that fails to validate will be silently dropped.

Note

Dropping object values may lead to invalid events.

Note

This has no effect when object validation is disabled.

Returns:
Return type:JsonTranscoderMediator
IgnoreInvalidEvents(Warn=False)

Instructs the EDXML writer to ignore invalid events. After calling this method, any event that fails to validate will be dropped. If Warn is set to True, a detailed warning will be printed, allowing the source and cause of the problem to be determined.

Note

This also implies that invalid objects will be ignored.

Note

This has no effect when event validation is disabled.

Parameters:Warn (bool) – Print warnings or not
Returns:
Return type:JsonTranscoderMediator
AddEventSource(Source)

Adds an EDXML event source definition. If no event sources are added, a bogus source will be generated.

Warning

In EDXML v3, the source URL is used to compute sticky hashes. Therefore, adjusting the source URLs of events after generating them changes their hashes.

The mediator will not output the EDXML ontology until it receives its first event through the Process() method. This means that the caller can generate an event source ‘just in time’ by inspecting the first record it receives from its input and calling this method to add it to the mediator.

Parameters:Source (EventSource) – An EventSource instance
Returns:
Return type:JsonTranscoderMediator
classmethod GetTranscoder(RecordTypeName)

Returns a JsonTranscoder instance for transcoding records of specified type, or None if no transcoder has been registered for the record type.

Parameters:RecordTypeName (str) – Name of the JSON record type
Returns:
Return type:JsonTranscoder
Process(JsonData)

Processes a single JSON record, invoking the correct transcoder to generate an EDXML event and writing the event into the output.

The JSON record must be represented as either a dictionary or an object. When an object is passed, the mediator will attempt to read any attributes listed in the ATTRIBUTE_MAP of the matching transcoder from object attributes. When a dictionary is passed, it will attempt to read keys as listed in ATTRIBUTE_MAP. Using dotted notation, the keys in ATTRIBUTE_MAP can refer to dictionary values that are themselves dictionaries or lists.

Parameters:JsonData (dict,object) – JSON record, as a dictionary or object
Close()

Finalizes the transcoding process by flushing the output buffer.