Attribute

class corpustools.corpus.classes.lexicon.Attribute(name, att_type, display_name=None, default_value=None)

Attributes collect summary information about attributes of Words or WordTokens; the attribute type determines how each one behaves.

Parameters:

name : str

Python-safe name for using getattr and setattr on Words and WordTokens

att_type : str

Either ‘spelling’, ‘tier’, ‘numeric’ or ‘factor’

display_name : str

Human-readable name of the Attribute, defaults to None

default_value : object

Default value for initializing the attribute

Attributes

name (string) Python-readable name for the Attribute on Word and WordToken objects
display_name (string) Human-readable name for the Attribute
default_value (object) Default value for the Attribute. The type of default_value is dependent on the attribute type. Numeric Attributes have a float default value. Factor and Spelling Attributes have a string default value. Tier Attributes have a Transcription default value.
range (object) Range of the Attribute; its type depends on the attribute type. For Numeric Attributes, the range is a tuple of floats giving the minimum and maximum. For Factor Attributes, it is the set of all factor levels. For Tier Attributes, it is the set of segments in that tier across the corpus. For Spelling Attributes, it is None.

Methods

__init__(name, att_type[, display_name, ...])
guess_type(values[, trans_delimiters]) Guess the attribute type for a sequence of values
sanitize_name(name) Sanitize a display name into a Python-readable attribute name
update_range(value) Update the range of the Attribute with the value specified.

static guess_type(values, trans_delimiters=None)

Guess the attribute type for a sequence of values

Parameters:

values : list

List of strings to evaluate for the attribute type

trans_delimiters : list, optional

List of delimiters to look for in transcriptions; defaults to ‘.’, ‘;’ and ‘,’

Returns:

str

Attribute type that had the most success in parsing the values specified
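
The "most success in parsing" rule can be pictured as a vote over the values. The sketch below is a standalone illustration and not the library's code: the function name guess_type_sketch and the specific tests for each type (float parsing for numeric, delimiter presence for tier, repeated values for factor) are assumptions made for this example.

```python
from collections import Counter


def guess_type_sketch(values, trans_delimiters=None):
    """Hypothetical stand-in for Attribute.guess_type; the heuristics are assumed."""
    if trans_delimiters is None:
        # Defaults listed in the parameter description above
        trans_delimiters = ['.', ';', ',']
    occurrences = Counter(values)
    votes = {'spelling': 0, 'tier': 0, 'numeric': 0, 'factor': 0}
    for v in values:
        try:
            float(v)
        except ValueError:
            pass
        else:
            # Parses as a number, so it votes for numeric
            votes['numeric'] += 1
            continue
        if any(d in v for d in trans_delimiters):
            # Contains a transcription delimiter, so it votes for tier
            votes['tier'] += 1
        elif occurrences[v] > 1:
            # Assumption: a repeated string looks like a factor level
            votes['factor'] += 1
        else:
            votes['spelling'] += 1
    # The type with the most votes wins
    return max(votes, key=votes.get)


print(guess_type_sketch(['k.ae.t', 'd.o.g']))
print(guess_type_sketch(['noun', 'verb', 'noun', 'verb']))
```

Numeric parsing is tried first in this sketch so that a value such as 2.5 is not mistaken for a transcription because it contains a period.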

static sanitize_name(name)

Sanitize a display name into a Python-readable attribute name

Parameters:

name : str

Display name to sanitize

Returns:

str

Sanitized name
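
This section does not state the exact sanitization rule, so the following is only one plausible rule, written as a standalone function. The name sanitize_name_sketch and every step in it (lowercasing, replacing non-word characters with underscores, prefixing a leading digit) are assumptions for illustration; the library's actual output may differ.

```python
import re


def sanitize_name_sketch(name):
    """Hypothetical sanitizer: turn a display name into a valid Python identifier."""
    # Lowercase, then collapse each run of non-word characters into one underscore
    cleaned = re.sub(r'\W+', '_', name.strip().lower()).strip('_')
    # Identifiers cannot be empty or start with a digit
    if not cleaned or cleaned[0].isdigit():
        cleaned = '_' + cleaned
    return cleaned


print(sanitize_name_sketch('Frequency (per million)'))
```

A name produced this way can be used directly with getattr and setattr, which is the purpose the name parameter of Attribute describes.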

update_range(value)

Update the range of the Attribute with the value specified. If the attribute is a Factor, the value is added to the set of levels. If the attribute is Numeric, the minimum or maximum is extended when the value falls outside the current range. If the attribute is a Tier, the value (a segment) is added to the set of allowed segments. If the attribute is Spelling, nothing is done.

Parameters:

value : object

Value to update range with, the type depends on the attribute type
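
The per-type behaviour described for update_range can be sketched with a small standalone class. AttributeRangeSketch is a hypothetical name and is not part of corpustools; the starting range for each type (None for numeric until a value arrives, an empty set for factor and tier) is also an assumption of this sketch.

```python
class AttributeRangeSketch:
    """Hypothetical stand-in that mimics only the range bookkeeping described above."""

    def __init__(self, att_type):
        if att_type not in ('spelling', 'tier', 'numeric', 'factor'):
            raise ValueError('unknown attribute type: %r' % (att_type,))
        self.att_type = att_type
        # Assumed starting points: sets for factor/tier, None otherwise
        self.range = set() if att_type in ('factor', 'tier') else None

    def update_range(self, value):
        if self.att_type == 'numeric':
            value = float(value)
            if self.range is None:
                self.range = (value, value)
            else:
                low, high = self.range
                # Extend the bounds only when the value falls outside them
                self.range = (min(low, value), max(high, value))
        elif self.att_type in ('factor', 'tier'):
            # Factor levels and tier segments accumulate in a set
            self.range.add(value)
        # Spelling: nothing is done, the range stays None


freq = AttributeRangeSketch('numeric')
for v in (3, 1.5, 7):
    freq.update_range(v)
print(freq.range)
```

Keeping the factor and tier ranges as sets means repeated values are absorbed without growing the range, which matches the "set of all factor levels" wording in the attribute description.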