EDXML Event Representations

EDXML events are represented as instances of the EDXMLEvent class and its subclasses.

Accessing Properties

Event properties are accessible through the properties attribute, which stores the objects of each property as a set. For convenience, events also implement the MutableMapping interface, so properties can be read and written directly on the event. A quick example:

from edxml import EDXMLEvent

# Create an event
event = EDXMLEvent(properties={'names': {'Alice', 'Bob'}})

for property_name, object_values in event.items():
    for object_value in object_values:
        print(object_value)

Note that the event created above is incomplete and not even valid: it has neither an event type nor an event source. EDXMLEvent instances are not bound to an ontology, so an instance has no inherent concept of validity. You can create any event you like, valid or invalid. To validate an instance, call its is_valid() method, which accepts an ontology as parameter.

Because events are not bound to an ontology, there is no distinction between a property that does not exist and a property that exists but has no objects. Therefore, checking whether the properties dictionary has a certain key returns False even if that key has been assigned an empty set.
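
The behaviour described above can be illustrated without the library. The following is a minimal sketch using only the standard library. PropertyStore is a hypothetical stand-in written for this illustration, not the SDK's own property container, and reading a missing property as an empty set is an assumption.

```python
from collections.abc import MutableMapping


class PropertyStore(MutableMapping):
    """Hypothetical stand-in: maps property names to sets of object values."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, name):
        # Assumption: a missing property reads as an empty set.
        return self._data.get(name, set())

    def __setitem__(self, name, values):
        # Wrap single values into a set before storing them.
        if isinstance(values, (str, int, float)):
            values = {values}
        self._data[name] = set(values)

    def __delitem__(self, name):
        self._data.pop(name, None)

    def __iter__(self):
        # Only properties that hold at least one object are visible.
        return (name for name, values in self._data.items() if values)

    def __len__(self):
        return sum(1 for _ in self)

    def __contains__(self, name):
        # An empty property is indistinguishable from a missing one.
        return bool(self._data.get(name))


store = PropertyStore()
store['names'] = set()
print('names' in store)   # False
store['names'] = 'Alice'
print('names' in store)   # True
```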

Writing event properties works just as you would expect. Some examples are shown below.

from edxml import EDXMLEvent

# Create an event
event = EDXMLEvent()

# Assign properties in one go
event.properties = {'names': {'Alice', 'Bob'}}

# Single values are wrapped into a set automatically
event.properties['names'] = 'Alice'

# The above could be shortened like this:
event['names'] = 'Alice'

# Add an object value
event['names'].add('Bob')

# Clear property
del event['names']

Accessing Attachments

Event attachments can be accessed by means of the attachments attribute. This attribute is a dictionary mapping attachment names to the attachment values. Each attachment value is itself a dictionary mapping attachment identifiers to their string values. Another quick example:

from edxml import EDXMLEvent

# Create an event
event = EDXMLEvent(properties={'names': {'Alice'}})

# Add a 'document' attachment with its SHA1 hash as ID
event.attachments['document'] = 'FooBar'

# Add a 'document' attachment while explicitly setting its ID
event.attachments['document'] = {'1': 'FooBar'}

As you can see in the above example, explicitly setting the identifiers of individual attachment values is not needed. When omitted, the SHA1 hashes of the attachment values will be used as identifiers.
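
To make the default identifier concrete, the sketch below computes a SHA1 based identifier with the standard library only. It assumes the value is encoded as UTF-8 and the digest is rendered as a hexadecimal string. The text above does not state either detail, so treat both as assumptions.

```python
import hashlib

value = 'FooBar'

# Assumption: UTF-8 encoding of the value, hexadecimal digest as identifier.
identifier = hashlib.sha1(value.encode('utf-8')).hexdigest()

# The resulting shape matches the {identifier: value} dictionaries described above.
attachment = {identifier: value}
print(len(identifier))   # 40
```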

EDXMLEvent Subclasses

The EDXMLEvent class has two subclasses. The first one is the ParsedEvent class. As the name suggests, this class is instantiated by EDXML parsers. In fact, it can only be instantiated by lxml, which is the library that the EDXML parser is built on. Its instances are a mix of a regular EDXMLEvent and an etree.Element instance. The reason for a separate parsed event variant is performance: the lxml library can generate these objects at minimal cost, and they can be passed through to EDXML writers for re-serialization at minimal cost.

The second subclass of EDXMLEvent is EventElement. This class is a wrapper around an lxml etree.Element instance containing an <event> XML element, providing the same convenient access interface as its parent, EDXMLEvent. The EventElement is mainly used for generating events that are intended for feeding to EDXML writers.

Object Value Types

Since EDXML data is XML, all object values in an EDXML document are strings. As a result, the events that are generated by the parser will only contain values of type str. When writing object values into an event, Python types other than str can be used. For example, writing a float into an event is perfectly fine.

What happens when a non-string value is written into an event depends on the particular event implementation. The base class EDXMLEvent does not care about the values stored in its properties until the event needs to be converted into an EventElement. This happens when the event is written using a transcoder or using the EDXMLWriter. At that point, any non-string values are converted into strings. If that fails, an exception is raised.

Instances of EventElement are both an EDXML event and an XML element and the conversion to strings happens immediately when a property is written. This means that, in general, writing an event property may raise an exception.

The EDXML event implementations can convert various Python types into strings. These types include float, bool, datetime, Decimal and IP (from IPy).
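
The general idea of such a conversion can be sketched as follows. The function to_object_string is hypothetical, and the output formats it produces (lowercase booleans, ISO 8601 timestamps) are assumptions made for illustration. The formats that the SDK actually emits may differ. The IP type from IPy is left out to keep the sketch limited to the standard library.

```python
from datetime import datetime, timezone
from decimal import Decimal


def to_object_string(value):
    """Hypothetical converter; the output formats are assumptions."""
    if isinstance(value, str):
        return value
    if isinstance(value, bool):
        # bool must be tested before int, because bool is a subclass of int.
        return 'true' if value else 'false'
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, (int, float, Decimal)):
        return str(value)
    # Mirrors the documented behaviour that a failed conversion raises.
    raise ValueError(f'cannot convert {type(value).__name__} to a string')


print(to_object_string(1.5))               # 1.5
print(to_object_string(Decimal('1.50')))   # 1.50
print(to_object_string(True))              # true
```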

Illegal XML characters

Some types of characters are illegal in XML. For that reason, writing an object value string into an event can raise a ValueError. Calling the replace_invalid_characters() method of an event makes it replace illegal characters automatically with the Unicode replacement character.
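
As an illustration of what counts as illegal, the sketch below replaces every character outside the XML 1.0 character range with the Unicode replacement character. It uses only the standard library, and the helper name replace_illegal_xml_chars is hypothetical. It shows the technique, not how the SDK implements it.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
# tab, line feed, carriage return, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_ILLEGAL_XML_CHARS = re.compile(
    '[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]'
)


def replace_illegal_xml_chars(text: str) -> str:
    """Hypothetical helper: substitute U+FFFD for characters XML cannot carry."""
    return _ILLEGAL_XML_CHARS.sub('\uFFFD', text)


print(replace_illegal_xml_chars('a\x00b') == 'a\ufffdb')   # True
```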

Class Documentation

The class documentation of the various event implementations can be found below.

EDXMLEvent

class edxml.EDXMLEvent(properties=None, event_type_name=None, source_uri=None, parents=None, attachments=None, foreign_attribs=None)

Bases: collections.abc.MutableMapping

Class representing an EDXML event.

The event allows its properties to be accessed and set much like a dictionary:

event['property-name'] = 'value'

Note

Properties are sets of object values. On assignment, single values are automatically wrapped into sets.

Creates a new EDXML event. The properties argument must be a dictionary mapping property names to object values. Object values must be lists of one or multiple object values. Explicit parent hashes must be specified as hex encoded strings. Attachments must be specified as a dictionary mapping attachment names to attachment values. The attachment values are dictionaries mapping attachment identifiers to the actual attachment strings.

Parameters:
  • properties (Optional[Dict[str,Set[str]]]) – Dictionary of properties
  • event_type_name (Optional[str]) – Name of the event type
  • source_uri (Optional[str]) – Event source URI
  • parents (Optional[List[str]]) – List of explicit parent hashes
  • attachments (Optional[Dict[str, Dict[str, str]]]) – Event attachments dictionary
  • foreign_attribs (Optional[Dict[str, str]]) –
Returns:

EDXMLEvent

replace_invalid_characters(replace=True)

Enables automatic replacement of invalid Unicode characters with the Unicode replacement character. This will be used to produce valid XML representations of events containing invalid Unicode characters in their property objects or attachments.

Enabling this feature may be useful when dealing with broken input data that triggers an occasional ValueError. Instead of raising the error, the event automatically replaces the invalid data.

Parameters:replace (bool) –
Returns:edxml.EDXMLEvent
properties

Class property storing the properties of the event.

Returns:Event properties
Return type:PropertySet
attachments

Class property storing the attachments of the event.

Returns:Dict[str, str]
get_any(property_name, default=None)

Convenience method for fetching any of possibly multiple object values of a specific event property. If the requested property has no object values, the specified default value is returned instead.

Parameters:
  • property_name (string) – Name of requested property
  • default – Default return value

Returns:

get_element(sort=False)

Returns the event as an XML element. When the sort parameter is set to True, the properties, attachments and event parents are sorted as required for obtaining the event in its normal form as defined in the EDXML specification.

Parameters:sort (bool) – Sort element components
Returns:
Return type:etree.Element
copy()

Returns a copy of the event.

Returns:EDXMLEvent
classmethod create(properties=None, event_type_name=None, source_uri=None, parents=None, attachments=None)

Creates a new EDXML event. The properties argument must be a dictionary mapping property names to object values. Object values may be single values or a list of multiple object values. Explicit parent hashes must be specified as hex encoded strings.

Attachments are specified by means of a dictionary mapping attachment names to strings.

Note

For a slight performance gain, use the EDXMLEvent constructor directly to create new events.

Parameters:
  • properties (Optional[Dict[str,Union[str,List[str]]]]) – Dictionary of properties
  • event_type_name (Optional[str]) – Name of the event type
  • source_uri (Optional[str]) – Event source URI
  • parents (Optional[List[str]]) – List of explicit parent hashes
  • attachments (Optional[Dict[str, str]]) – Event attachments dictionary
Returns:

Return type:

EDXMLEvent

get_type_name()

Returns the name of the event type.

Returns:The event type name
Return type:str
get_source_uri()

Returns the URI of the event source.

Returns:The source URI
Return type:str
get_properties()

Returns a dictionary containing property names as keys. The values are lists of object values.

Returns:Event properties
Return type:PropertySet
get_parent_hashes()

Returns a list of sticky hashes of parent events. The hashes are hex encoded strings.

Returns:List of parent hashes
Return type:List[str]
get_attachments()

Returns the attachments of the event as a dictionary mapping attachment names to the attachment values.

Returns:Event attachments
Return type:Dict[str, str]
get_foreign_attributes()

Returns any non-edxml event attributes as a dictionary having the attribute names as keys and their associated values. The namespace is prepended to the keys in James Clark notation:

{'{http://some/foreign/namespace}attribute': 'value'}

Returns: Dict[str, str]
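
For readers unfamiliar with James Clark notation, the sketch below shows how a namespaced attribute appears as a '{namespace}name' key. It uses the standard library's xml.etree.ElementTree rather than the SDK, and the bare event element is a made-up fragment, not a valid EDXML event.

```python
import xml.etree.ElementTree as ET

# A made-up element carrying one attribute from a foreign namespace.
document = '<event xmlns:f="http://some/foreign/namespace" f:attribute="value"/>'
element = ET.fromstring(document)

# ElementTree exposes namespaced attribute names in James Clark notation.
print(element.attrib)
# {'{http://some/foreign/namespace}attribute': 'value'}
```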

set_properties(properties)

Replaces the event properties with the properties from specified dictionary. The dictionary must contain property names as keys. The values must be lists of strings.

Parameters:properties (Dict[str, List[str]]) – Event properties
Returns:
Return type:EDXMLEvent
copy_properties_from(source_event, property_map)

Copies properties from another event, mapping property names according to specified mapping. The property_map argument is a dictionary mapping property names from the source event to property names in the target event, which is the event that is used to call this method.

If multiple source properties map to the same target property, the objects of both properties will be combined in the target property.

Parameters:
  • source_event (EDXMLEvent) –
  • property_map (dict(str,str)) –
Returns:

Return type:

EDXMLEvent
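
The mapping semantics described above for copy_properties_from() can be sketched on plain dictionaries of sets. The helper copy_mapped is hypothetical and only mirrors the documented behaviour that colliding mappings are combined in the target property. It does not use the SDK.

```python
def copy_mapped(source, target, property_map):
    """Hypothetical helper operating on plain dicts that map names to sets."""
    for source_name, target_name in property_map.items():
        values = source.get(source_name, set())
        if values:
            # Union into the target, so colliding mappings are combined.
            target.setdefault(target_name, set()).update(values)
    return target


source = {'first': {'Alice'}, 'nick': {'Al'}, 'age': {'30'}}
target = copy_mapped(source, {}, {'first': 'names', 'nick': 'names'})
print(target == {'names': {'Alice', 'Al'}})   # True
```

Properties that are absent from the map, such as 'age' here, are not copied.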

move_properties_from(source_event, property_map)

Moves properties from another event, mapping property names according to specified mapping. The property_map argument is a dictionary mapping property names from the source event to property names in the target event, which is the event that is used to call this method.

If multiple source properties map to the same target property, the objects of both properties will be combined in the target property.

Parameters:
  • source_event (EDXMLEvent) –
  • property_map (dict(str,str)) –
Returns:

Return type:

EDXMLEvent

set_type(event_type_name)

Set the event type.

Parameters:event_type_name (str) – Name of the event type
Returns:
Return type:EDXMLEvent
set_attachment(name, attachment)

Set the event attachment associated with the specified name in the event type definition. The attachment argument accepts a string value. Alternatively a list can be given, allowing for multi-valued attachments. In that case, each attachment will have its SHA1 hash as unique identifier. Lastly, the attachment can be specified as a dictionary containing attachment identifiers as keys and the attachment strings as values. This allows control over choosing attachment identifiers.

Specifying None as attachment value removes the attachment from the event.

Parameters:
  • name (str) – Associated name in event type definition
  • attachment (Union[Optional[str], List[Optional[str]], Dict[str, Optional[str]]]) – Attachment dictionary
Returns:

Return type:

EDXMLEvent
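
The accepted argument shapes of set_attachment() can be summarised with a small dispatch sketch. The helper normalize_attachment is hypothetical, and the use of UTF-8 plus a hexadecimal SHA1 digest for generated identifiers is an assumption. The sketch returns None where the method would remove the attachment.

```python
import hashlib


def normalize_attachment(attachment):
    """Hypothetical helper returning {identifier: value}, or None for removal."""
    if attachment is None:
        # None means: remove the attachment from the event.
        return None
    if isinstance(attachment, dict):
        # Identifiers were chosen explicitly; keep them as they are.
        return dict(attachment)
    if isinstance(attachment, str):
        # A single string is treated as a list holding one value.
        attachment = [attachment]
    # Assumption: identifiers are hex SHA1 digests of the UTF-8 encoded values.
    return {hashlib.sha1(v.encode('utf-8')).hexdigest(): v for v in attachment}


print(normalize_attachment({'1': 'FooBar'}))          # {'1': 'FooBar'}
print(len(normalize_attachment(['Foo', 'Bar'])))      # 2
```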

set_source(source_uri)

Set the event source.

Parameters:source_uri (str) – EDXML source URI
Returns:
Return type:EDXMLEvent
add_parents(parent_hashes)

Add the specified sticky hashes to the list of explicit event parents.

Parameters:parent_hashes (List[str]) – list of sticky hashes, as hexadecimal strings
Returns:
Return type:EDXMLEvent
set_parents(parent_hashes)

Replace the set of explicit event parents with the specified list of sticky hashes.

Parameters:parent_hashes (List[str]) – list of sticky hashes, as hexadecimal strings
Returns:
Return type:EDXMLEvent
set_foreign_attributes(attribs)

Sets foreign attributes. Foreign attributes are XML attributes not specified by EDXML and have a namespace that is not the EDXML namespace. The attributes can be passed as a dictionary. The keys in the dictionary must include the namespace in James Clark notation. Example:

{'{http://some/namespace}attribute_name': 'attribute_value'}

Parameters:attribs (Dict[str,str]) – Attribute dictionary
Returns:
Return type:EDXMLEvent
compute_sticky_hash(event_type, hash_function=<built-in function openssl_sha1>, encoding='hex')

Computes the sticky hash of the event. By default, the hash will be computed using the SHA1 hash function and encoded into a hexadecimal string. The hashing function can be adjusted to any of the hashing functions in the hashlib module. The encoding can be adjusted by setting the encoding argument to any string encoding that is supported by the str.encode() method.

Parameters:
  • event_type (edxml.ontology.EventType) – The event type
  • hash_function (callable) – The hashlib hash function to use
  • encoding (str) – Desired output encoding
Returns:

String representation of the hash.

Return type:

str

is_valid(ontology)

Check if an event is valid for a given ontology.

Parameters:ontology (edxml.ontology.Ontology) – An EDXML ontology
Returns:True if the event is valid
Return type:bool

ParsedEvent

class edxml.ParsedEvent(properties=None, event_type_name=None, source_uri=None, parents=None, attachments=None, foreign_attribs=None)

Bases: edxml.event.EDXMLEvent, lxml.etree.ElementBase

This class extends both EDXMLEvent and etree.ElementBase to provide an EDXML event representation that can be generated directly by the lxml parser and can be treated much as if it were a normal lxml Element representing an 'event' element.

Note

The list and dictionary interfaces of etree.ElementBase are overridden by EDXMLEvent, so accessing keys will yield event properties rather than the XML attributes of the event element.

Note

This class can only be instantiated by parsers.

flush()

This class caches an alternative representation of the lxml Element, for internal use. Whenever the lxml Element is modified without using the dictionary interface, the flush() method must be called in order to refresh the internal state.

Returns:
Return type:ParsedEvent
copy()

Returns a copy of the event.

Returns:
Return type:ParsedEvent
classmethod create(properties=None, event_type_name=None, source_uri=None, parents=None, attachments=None)

This override of the create() method of the EDXMLEvent class only raises exceptions, because ParsedEvent objects can only be created by parsers.

Raises:NotImplementedError
get_properties()

Returns a dictionary containing property names as keys. The values are lists of object values.

Returns:Event properties
Return type:PropertySet
get_attachments()

Returns the attachments of the event as a dictionary mapping attachment names to the attachment values.

Returns:Event attachments
Return type:Dict[str, str]
get_foreign_attributes()

Returns any non-edxml event attributes as a dictionary having the attribute names as keys and their associated values. The namespace is prepended to the keys in James Clark notation:

{'{http://some/foreign/namespace}attribute': 'value'}

Returns: Dict[str, str]

get_parent_hashes()

Returns a list of sticky hashes of parent events. The hashes are hex encoded strings.

Returns:List of parent hashes
Return type:List[str]
get_element(sort=False)

Returns the event as an XML element. When the sort parameter is set to True, the properties, attachments and event parents are sorted as required for obtaining the event in its normal form as defined in the EDXML specification.

Parameters:sort (bool) – Sort element components
Returns:
Return type:etree.Element
set_properties(properties)

Replaces the event properties with the properties from specified dictionary. The dictionary must contain property names as keys. The values must be lists of strings.

Parameters:properties (Dict[str, List[str]]) – Event properties
Returns:
Return type:EDXMLEvent
set_attachment(name, attachment)

Set the event attachment associated with the specified name in the event type definition. The attachment argument accepts a string value. Alternatively a list can be given, allowing for multi-valued attachments. In that case, each attachment will have its SHA1 hash as unique identifier. Lastly, the attachment can be specified as a dictionary containing attachment identifiers as keys and the attachment strings as values. This allows control over choosing attachment identifiers.

Specifying None as attachment value removes the attachment from the event.

Parameters:
  • name (str) – Associated name in event type definition
  • attachment (Union[Optional[str], List[Optional[str]], Dict[str, Optional[str]]]) – Attachment dictionary
Returns:

Return type:

ParsedEvent

add_parents(parent_hashes)

Add the specified sticky hashes to the list of explicit event parents.

Parameters:parent_hashes (List[str]) – list of sticky hashes, as hexadecimal strings
Returns:
Return type:ParsedEvent
set_parents(parent_hashes)

Replace the set of explicit event parents with the specified list of sticky hashes.

Parameters:parent_hashes (List[str]) – list of sticky hashes, as hexadecimal strings
Returns:
Return type:ParsedEvent
set_foreign_attributes(attribs)

Sets foreign attributes. Foreign attributes are XML attributes not specified by EDXML and have a namespace that is not the EDXML namespace. The attributes can be passed as a dictionary. The keys in the dictionary must include the namespace in James Clark notation. Example:

{'{http://some/namespace}attribute_name': 'attribute_value'}

Parameters:attribs (Dict[str,str]) – Attribute dictionary
Returns:
Return type:EDXMLEvent
get_type_name()

Returns the name of the event type.

Returns:The event type name
Return type:str
get_source_uri()

Returns the URI of the event source.

Returns:The source URI
Return type:str
set_type(event_type_name)

Set the event type.

Parameters:event_type_name (str) – Name of the event type
Returns:
Return type:EDXMLEvent
set_source(source_uri)

Set the event source.

Parameters:source_uri (str) – EDXML source URI
Returns:
Return type:EDXMLEvent

EventElement

class edxml.EventElement(properties=None, event_type_name=None, source_uri=None, parents=None, attachments=None, foreign_attribs=None)

Bases: edxml.event.EDXMLEvent

This class extends EDXMLEvent to provide an EDXML event representation that wraps an etree Element instance, providing a convenient means to generate and manipulate EDXML <event> elements. Using this class is preferred over using EDXMLEvent if you intend to feed it to EDXMLWriter.

Creates a new EDXML event. The properties argument must be a dictionary mapping property names to object values. Object values must be lists of one or multiple strings. Explicit parent hashes must be specified as hex encoded strings. Attachments must be specified as a dictionary mapping attachment names to attachment values. The attachment values are dictionaries mapping attachment identifiers to the actual attachment strings.

Parameters:
  • properties (Dict[str, List[str]]) – Dictionary of properties
  • event_type_name (Optional[str]) – Name of the event type
  • source_uri (Optional[str]) – Event source URI
  • parents (Optional[List[str]]) – List of explicit parent hashes
  • attachments (Optional[Dict[str, Dict[str, str]]]) – Event attachments dictionary
  • foreign_attribs (Optional[Dict[str, str]]) –
Returns:

Return type:

EventElement

get_element(sort=False)

Returns the event as an XML element. When the sort parameter is set to True, the properties, attachments and event parents are sorted as required for obtaining the event in its normal form as defined in the EDXML specification.

Parameters:sort (bool) – Sort element components
Returns:
Return type:etree.Element
copy()

Returns a copy of the event.

Returns:
Return type:EventElement
classmethod create(properties=None, event_type_name=None, source_uri=None, parents=None, attachments=None)

Creates a new EDXML event. The properties argument must be a dictionary mapping property names to object values. Object values may be single values or a list of multiple object values. Explicit parent hashes must be specified as hex encoded strings.

Note

For a slight performance gain, use the EventElement constructor directly to create new events.

Parameters:
  • properties (Optional[Dict[str,Union[str,List[str]]]]) – Dictionary of properties
  • event_type_name (Optional[str]) – Name of the event type
  • source_uri (Optional[str]) – Event source URI
  • parents (Optional[List[str]]) – List of explicit parent hashes
  • attachments (Optional[Dict[str, Dict[str, str]]]) – Event attachments dictionary
Returns:

Return type:

EventElement

classmethod create_from_event(event)

Creates and returns a new EventElement instance by reading it from another EDXML event.

Parameters:event (EDXMLEvent) – The EDXML event to copy data from
Returns:
Return type:EventElement
get_properties()

Returns a dictionary containing property names as keys. The values are lists of object values.

Returns:Event properties
Return type:PropertySet
get_attachments()

Returns the attachments of the event as a dictionary mapping attachment names to the attachment values.

Returns:Event attachments
Return type:Dict[str, str]
get_foreign_attributes()

Returns any non-edxml event attributes as a dictionary having the attribute names as keys and their associated values. The namespace is prepended to the keys in James Clark notation:

{'{http://some/foreign/namespace}attribute': 'value'}

Returns: Dict[str, str]

get_parent_hashes()

Returns a list of sticky hashes of parent events. The hashes are hex encoded strings.

Returns:List of parent hashes
Return type:List[str]
get_type_name()

Returns the name of the event type.

Returns:The event type name
Return type:str
get_source_uri()

Returns the URI of the event source.

Returns:The source URI
Return type:str
set_properties(properties)

Replaces the event properties with the properties from specified dictionary. The dictionary must contain property names as keys. The values must be lists of strings.

Parameters:properties (Dict[str, List[str]]) – Event properties
Returns:
Return type:EventElement
set_attachment(name, attachment)

Set the event attachment associated with the specified name in the event type definition. The attachment argument accepts a string value. Alternatively a list can be given, allowing for multi-valued attachments. In that case, each attachment will have its SHA1 hash as unique identifier. Lastly, the attachment can be specified as a dictionary containing attachment identifiers as keys and the attachment strings as values. This allows control over choosing attachment identifiers.

Specifying None as attachment value removes the attachment from the event.

Parameters:
  • name (str) – Associated name in event type definition
  • attachment (Union[Optional[str], List[Optional[str]], Dict[str, Optional[str]]]) – Attachment dictionary
Returns:

Return type:

EventElement

add_parents(parent_hashes)

Add the specified sticky hashes to the list of explicit event parents.

Parameters:parent_hashes (List[str]) – list of sticky hashes, as hexadecimal strings
Returns:
Return type:EventElement
set_parents(parent_hashes)

Replace the set of explicit event parents with the specified list of sticky hashes.

Parameters:parent_hashes (List[str]) – list of sticky hashes, as hexadecimal strings
Returns:
Return type:EventElement
set_foreign_attributes(attribs)

Sets foreign attributes. Foreign attributes are XML attributes not specified by EDXML and have a namespace that is not the EDXML namespace. The attributes can be passed as a dictionary. The keys in the dictionary must include the namespace in James Clark notation. Example:

{'{http://some/namespace}attribute_name': 'attribute_value'}

Parameters:attribs (Dict[str,str]) – Attribute dictionary
Returns:
Return type:EDXMLEvent
set_type(event_type_name)

Set the event type.

Parameters:event_type_name (str) – Name of the event type
Returns:
Return type:EDXMLEvent
set_source(source_uri)

Set the event source.

Parameters:source_uri (str) – EDXML source URI
Returns:
Return type:EDXMLEvent