Discourse

class corpustools.corpus.classes.spontaneous.Discourse(kwargs)

Discourse objects are collections of linear text with word tokens

Parameters:

name : str

Identifier for the Discourse

speaker : Speaker

Speaker producing the tokens/text (defaults to an empty Speaker)

Attributes

attributes : list of Attributes

The Discourse object tracks all of the attributes used by its WordToken objects

words : dict of WordTokens

The keys are the beginning times of the WordTokens (or their place in a text if it’s not a speech discourse) and the values are the WordTokens

Methods

__init__(kwargs)
add_attribute(attribute[, initialize_defaults]) Add an Attribute of any type to the Discourse or replace an existing Attribute.
add_word(wordtoken) Adds a WordToken to the Discourse
create_lexicon() Create a Corpus object from the Discourse
find_wordtype(wordtype) Look up all WordTokens that are instances of a Word
keys() Returns a sorted list of keys for looking up WordTokens

add_attribute(attribute, initialize_defaults=False)

Add an Attribute of any type to the Discourse or replace an existing Attribute.

Parameters:

attribute : Attribute

Attribute to add or replace

initialize_defaults : bool

If True, word tokens will have this attribute set to the attribute’s default_value. Defaults to False.
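
The add-or-replace behaviour and the effect of initialize_defaults can be sketched as follows. This is a minimal illustration using hypothetical stand-in classes (SketchAttribute, SketchToken, SketchDiscourse), not the corpustools classes themselves; matching attributes by name is an assumption, and the real implementation may differ.

```python
class SketchAttribute:
    """Hypothetical stand-in for an Attribute: a name plus a default value."""

    def __init__(self, name, default_value):
        self.name = name
        self.default_value = default_value


class SketchToken:
    """Hypothetical stand-in for a WordToken."""

    def __init__(self, begin):
        self.begin = begin


class SketchDiscourse:
    """Hypothetical stand-in for a Discourse; not the corpustools class."""

    def __init__(self):
        self.attributes = []
        self.words = {}

    def add_attribute(self, attribute, initialize_defaults=False):
        # Replace an attribute with the same name (assumed matching rule),
        # otherwise append the new one.
        for i, existing in enumerate(self.attributes):
            if existing.name == attribute.name:
                self.attributes[i] = attribute
                break
        else:
            self.attributes.append(attribute)
        # Optionally give every token the attribute's default value.
        if initialize_defaults:
            for token in self.words.values():
                setattr(token, attribute.name, attribute.default_value)


discourse = SketchDiscourse()
discourse.words[0.0] = SketchToken(0.0)
discourse.add_attribute(SketchAttribute("pitch", 0), initialize_defaults=True)
```

With initialize_defaults left at False, the attribute would be registered on the Discourse but the existing token would not receive a `pitch` value.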

add_word(wordtoken)

Adds a WordToken to the Discourse

Parameters:

wordtoken : WordToken

WordToken to be added

create_lexicon()

Create a Corpus object from the Discourse

Returns:

Corpus

Corpus with spelling and transcription from the previous Corpus and token frequency from the Discourse
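
The "token frequency" part of the result amounts to counting how many tokens of each word type occur in the Discourse. A rough sketch of that counting step, assuming plain spelling strings stand in for WordToken objects (the real method builds a full Corpus):

```python
from collections import Counter

# Hypothetical word table: begin time -> spelling, standing in for WordTokens.
tokens = {0.0: "the", 0.4: "cat", 0.9: "the"}

# Token frequency: number of occurrences of each word type in the discourse.
token_frequency = Counter(tokens.values())
```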

find_wordtype(wordtype)

Look up all WordTokens that are instances of a Word

Parameters:

wordtype : Word

Word to look up

Returns:

list of WordTokens

List of the given Word’s WordTokens in this Discourse
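
Conceptually, this lookup filters the Discourse's tokens by the word type each one instantiates. A minimal sketch, assuming a hypothetical SketchToken tuple in place of WordToken and strings in place of Word objects:

```python
from collections import namedtuple

# Hypothetical stand-in for a WordToken: a begin time plus its word type.
SketchToken = namedtuple("SketchToken", ["begin", "wordtype"])

words = {
    0.0: SketchToken(0.0, "the"),
    0.4: SketchToken(0.4, "cat"),
    0.9: SketchToken(0.9, "the"),
}


def find_wordtype(words, wordtype):
    """Return every token in `words` whose word type equals `wordtype`."""
    return [token for token in words.values() if token.wordtype == wordtype]


matches = find_wordtype(words, "the")
```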

has_audio

Checks whether the Discourse is associated with a .wav file

Returns:

bool

True if a .wav file is associated and that file exists, False otherwise
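
The two conditions above (a path is set, and the file is present on disk) can be sketched as a standalone check. The helper name and the `wav_path` argument are hypothetical; how the Discourse actually stores its audio path is not shown here.

```python
import os


def has_audio(wav_path):
    """Hypothetical helper: True only if a path is set and the file exists."""
    return wav_path is not None and os.path.exists(wav_path)
```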

keys()

Returns a sorted list of keys for looking up WordTokens

Returns:

list

List of begin times or indices of WordTokens in the Discourse
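
Because the words dictionary is keyed by begin time (or index), sorted keys give a way to walk the tokens in their original order. A small sketch, assuming plain strings stand in for WordToken objects and a free function stands in for the method:

```python
# Hypothetical word table keyed by begin time in seconds, inserted out of order.
words = {0.9: "the", 0.0: "the", 0.4: "cat"}


def sorted_keys(words):
    """Return the lookup keys in ascending order."""
    return sorted(words.keys())


# Visit the tokens in the order they occur in the discourse.
tokens_in_order = [words[key] for key in sorted_keys(words)]
```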