# API Reference

## Lexicon classes

| Class | Description |
| --- | --- |
| `lexicon.Attribute(name, att_type[, …])` | Attributes are for collecting summary information about attributes of … |
| `lexicon.Corpus(name[, update])` | Lexicon to store information about Words, such as transcriptions, … |
| `lexicon.Inventory([update])` | Inventories contain information about a Corpus’ segmental inventory. |
| `lexicon.FeatureMatrix(name, feature_entries)` | An object that stores feature values for segments |
| `lexicon.Segment(symbol[, features])` | Class for segment symbols |
| `lexicon.Transcription(seg_list)` | Transcription object, sequence of symbols |
| `lexicon.Word([update])` | An object representing a word in a corpus |
| `lexicon.EnvironmentFilter(middle_segments[, …])` | Filter to use for searching words to generate Environments that match … |
| `lexicon.Environment(middle, position[, lhs, rhs])` | Specific sequence of segments that was a match for an EnvironmentFilter |

## Speech corpus classes

| Class | Description |
| --- | --- |
| `spontaneous.Discourse(kwargs)` | Discourse objects are collections of linear text with word tokens |
| `spontaneous.Speaker(name, **kwargs)` | Speaker objects contain information about the producers of WordTokens |
| `spontaneous.SpontaneousSpeechCorpus(name, …)` | SpontaneousSpeechCorpus objects are a collection of Discourse objects and Corpus objects for frequency information. |
| `spontaneous.WordToken([update])` | WordToken objects are individual productions of Words |

### Corpus context managers

| Class | Description |
| --- | --- |
| `contextmanagers.BaseCorpusContext(corpus, …)` | Abstract Corpus context class that all other contexts inherit from. |
| `contextmanagers.CanonicalVariantContext(…)` | Corpus context that uses canonical forms for transcriptions and tiers |
| `contextmanagers.MostFrequentVariantContext(…)` | Corpus context that uses the most frequent pronunciation variants |
| `contextmanagers.SeparatedTokensVariantContext(…)` | Corpus context that treats pronunciation variants as separate types |
| `contextmanagers.WeightedVariantContext(…)` | Corpus context that weights frequency of pronunciation variants by the … |

## Corpus binaries

| Function | Description |
| --- | --- |
| `binary.download_binary(name, path[, call_back])` | Download a binary file of example corpora and feature matrices. |
| `binary.load_binary(path)` | Unpickle a binary file |
| `binary.save_binary(obj, path)` | Pickle a Corpus or FeatureMatrix object for later loading |
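The save and load descriptions above refer to pickling. As a rough illustration of that round trip using only the standard library, the sketch below pickles an object to disk and reads it back. The helper names `save_object` and `load_object` are hypothetical and are not part of the `binary` module, and a plain dictionary stands in for a Corpus or FeatureMatrix.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Pickle any picklable object to `path` (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Unpickle and return the object stored at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


# A plain dict stands in for a Corpus or FeatureMatrix (assumption).
toy_corpus = {"name": "toy", "words": {"cat": ["k", "æ", "t"]}}

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "toy.corpus")
    save_object(toy_corpus, target)
    restored = load_object(target)
```

As with any pickle-based format, only files from a trusted source should be unpickled.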

## Loading from CSV

| Function | Description |
| --- | --- |
| `csv.load_corpus_csv(corpus_name, path, delimiter)` | Load a corpus from a column-delimited text file |
| `csv.load_feature_matrix_csv(name, path, …)` | Load a FeatureMatrix from a column-delimited text file |

## Export to CSV

| Function | Description |
| --- | --- |
| `csv.export_corpus_csv(corpus, path[, …])` | Save a corpus as a column-delimited text file |
| `csv.export_feature_matrix_csv(…[, delimiter])` | Save a FeatureMatrix as a column-delimited text file |

## TextGrids

| Function |
| --- |
| `textgrid.inspect_discourse_textgrid` |
| `textgrid.load_discourse_textgrid` |
| `textgrid.load_directory_textgrid` |

## Running text

| Function | Description |
| --- | --- |
| `text_spelling.inspect_discourse_spelling(path)` | Generate a list of AnnotationTypes for a specified text file for parsing |
| `text_spelling.load_discourse_spelling(…[, …])` | Load a discourse from a text file containing running text of … |
| `text_spelling.load_directory_spelling(…[, …])` | Loads a directory of orthographic texts |
| `text_spelling.export_discourse_spelling(…)` | Export an orthography discourse to a text file |
| `text_transcription.inspect_discourse_transcription(path)` | Generate a list of AnnotationTypes for a specified text file for parsing |
| `text_transcription.load_discourse_transcription(…)` | Load a discourse from a text file containing running transcribed text |
| `text_transcription.load_directory_transcription(…)` | Loads a directory of transcribed texts. |
| `text_transcription.export_discourse_transcription(…)` | Export a transcribed discourse to a text file |

## Interlinear gloss text

| Function | Description |
| --- | --- |
| `text_ilg.inspect_discourse_ilg(path[, number])` | Generate a list of AnnotationTypes for a specified text file for parsing |
| `text_ilg.load_discourse_ilg(corpus_name, …)` | Load a discourse from a text file containing interlinear glosses |
| `text_ilg.load_directory_ilg(corpus_name, …)` | Loads a directory of interlinear gloss text files |
| `text_ilg.export_discourse_ilg(discourse, path)` | Export a discourse to an interlinear gloss text file, with a maximal … |

## Other standards

| Function | Description |
| --- | --- |
| `multiple_files.inspect_discourse_multiple_files(…)` | Generate a list of AnnotationTypes for a specified dialect |
| `multiple_files.load_discourse_multiple_files(…)` | Load a discourse from a text file containing interlinear glosses |
| `multiple_files.load_directory_multiple_files(…)` | Loads a directory of corpus standard files (separated into words files … |

## Frequency of alternation

| Function | Description |
| --- | --- |
| `freq_of_alt.calc_freq_of_alt(corpus_context, …)` | Returns a double that is a measure of the frequency of … |

## Functional load

| Function | Description |
| --- | --- |
| `functional_load.minpair_fl(corpus_context, …)` | Calculate the functional load of the contrast between two segments as a count of minimal pairs. |
| `functional_load.deltah_fl(corpus_context, …)` | Calculate the functional load of the contrast between two segments as the decrease in corpus entropy caused by a merger. |
| `functional_load.relative_minpair_fl(…[, …])` | Calculate the average functional load of the contrasts between a segment and all other segments, as a count of minimal pairs. |
| `functional_load.relative_deltah_fl(…[, …])` | Calculate the average functional load of the contrasts between a segment and all other segments, as the decrease in corpus entropy caused by a merger. |
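To make the minimal-pair idea concrete, here is a small standalone sketch that counts word pairs differing only by the two segments of interest in a single position. It does not use or reproduce the `functional_load` module; the function name `count_minimal_pairs` and the toy word list are assumptions made for illustration, and the library's own counting and normalisation options may differ.

```python
from itertools import combinations


def count_minimal_pairs(transcriptions, seg1, seg2):
    """Count pairs of words that differ only by seg1 vs. seg2 in one position.

    `transcriptions` is an iterable of tuples of segment symbols.
    Hypothetical helper for illustration only.
    """
    contrast = {seg1, seg2}
    count = 0
    for a, b in combinations(transcriptions, 2):
        if len(a) != len(b):
            continue  # substitution-only minimal pairs have equal length
        diffs = [(x, y) for x, y in zip(a, b) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == contrast:
            count += 1
    return count


# Toy lexicon (assumption): each word is a tuple of segments.
words = [
    ("p", "a", "t"),
    ("b", "a", "t"),
    ("p", "i", "n"),
    ("b", "i", "n"),
    ("t", "a", "p"),
]

pb_pairs = count_minimal_pairs(words, "p", "b")
```

Here the pairs pat/bat and pin/bin are the only ones that differ solely in /p/ versus /b/.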

## Kullback-Leibler divergence

| Function | Description |
| --- | --- |
| `kl.KullbackLeibler(corpus_context, seg1, …)` | Calculates KL distances between two Phoneme objects in some context, either the left or right-hand side. |
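For readers unfamiliar with the measure, the sketch below computes the standard Kullback-Leibler divergence between two discrete distributions, here built from made-up counts of left-hand neighbours of two segments. The helper names and the counts are assumptions; this is the textbook formula, not the `kl` module's implementation, which may smooth or symmetrise the result.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn a mapping of raw counts into a probability distribution."""
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


def kl_divergence(p, q):
    """D(P || Q) in bits; infinite if Q gives zero mass where P does not."""
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0:
            continue  # 0 * log(0 / q) is taken as 0
        q_x = q.get(outcome, 0.0)
        if q_x == 0:
            return math.inf
        total += p_x * math.log2(p_x / q_x)
    return total


# Hypothetical counts of the segment found to the left of each target segment.
left_of_seg1 = Counter({"a": 6, "i": 2})
left_of_seg2 = Counter({"a": 2, "i": 6})

p = normalize(left_of_seg1)
q = normalize(left_of_seg2)
divergence = kl_divergence(p, q)
```

The divergence is asymmetric in general, so `kl_divergence(p, q)` and `kl_divergence(q, p)` need not agree.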

## Mutual information

| Function | Description |
| --- | --- |
| `mutual_information.pointwise_mi(…[, …])` | Calculate the mutual information for a bigram. |
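As an illustration of what pointwise mutual information for a bigram means, the following sketch estimates it from type counts over a toy word list using the usual definition, log2 of p(x, y) / (p(x) p(y)). The function `bigram_pmi` and the data are hypothetical; the library function's arguments and options (elided in the signature above) are not reproduced here.

```python
import math
from collections import Counter


def bigram_pmi(transcriptions, bigram):
    """Pointwise mutual information (in bits) of an ordered segment pair.

    Probabilities are estimated from unigram and bigram counts over
    `transcriptions`, an iterable of tuples of segments.
    """
    unigrams = Counter()
    bigrams = Counter()
    for segs in transcriptions:
        unigrams.update(segs)
        bigrams.update(zip(segs, segs[1:]))

    if bigrams[bigram] == 0:
        raise ValueError("bigram does not occur in the data")

    first, second = bigram
    p_xy = bigrams[bigram] / sum(bigrams.values())
    p_x = unigrams[first] / sum(unigrams.values())
    p_y = unigrams[second] / sum(unigrams.values())
    return math.log2(p_xy / (p_x * p_y))


# Toy data (assumption): three two-segment words.
toy_words = [("a", "b"), ("a", "b"), ("b", "a")]
pmi_ab = bigram_pmi(toy_words, ("a", "b"))
```

A positive value indicates that the two segments co-occur more often than their individual frequencies would predict.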

## Neighborhood density

| Function | Description |
| --- | --- |
| `neighborhood_density.neighborhood_density(…)` | Calculate the neighborhood density of a particular word in the corpus. |
| `neighborhood_density.find_mutation_minpairs(…)` | Find all minimal pairs of the query word based only on segment … |
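One common way to define neighborhood density is the number of words that are exactly one segment substitution, insertion, or deletion away from the query. The sketch below implements that definition on a toy lexicon; it is an assumption about the general technique, and the names `is_neighbor` and `density` are hypothetical rather than part of the `neighborhood_density` module.

```python
def is_neighbor(a, b):
    """True if b is exactly one substitution, insertion, or deletion from a."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make `a` the shorter (or equal-length) sequence
    # Skip the shared prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # One substitution: everything after the mismatch must agree.
        return a[i + 1:] == b[i + 1:]
    # One insertion into `a`: the rest of `a` must match `b` after a skip.
    return a[i:] == b[i + 1:]


def density(query, lexicon):
    """Number of words in `lexicon` that are neighbors of `query`."""
    return sum(1 for word in lexicon if is_neighbor(query, word))


# Toy lexicon (assumption): tuples of segments.
lexicon = [
    ("k", "æ", "t"),
    ("b", "æ", "t"),
    ("æ", "t"),
    ("k", "æ", "t", "s"),
    ("d", "ɔ", "g"),
    ("k", "æ", "p"),
]
cat_density = density(("k", "æ", "t"), lexicon)
```

The query word itself is not counted as its own neighbor.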

## Phonotactic probability

| Function | Description |
| --- | --- |
| `phonotactic_probability.phonotactic_probability_vitevitch(…)` | Calculate the phonotactic_probability of a particular word using … |

## Predictability of distribution

| Function | Description |
| --- | --- |
| `pred_of_dist.calc_prod_all_envs(…[, …])` | Main function for calculating predictability of distribution for two segments over a corpus, regardless of environment. |
| `pred_of_dist.calc_prod(corpus_context, envs)` | Main function for calculating predictability of distribution for two segments over specified environments in a corpus. |
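Predictability of distribution is often quantified as the entropy of the choice between two segments: 0 bits when only one of them occurs (fully predictable) and 1 bit when both are equally frequent. The sketch below assumes that entropy-based formulation and ignores environments entirely; the function names and data are hypothetical, and nothing here is taken from the `pred_of_dist` implementation.

```python
import math
from collections import Counter


def two_segment_entropy(count1, count2):
    """Entropy in bits of the choice between two segments, given their counts."""
    total = count1 + count2
    if total == 0:
        raise ValueError("neither segment occurs")
    entropy = 0.0
    for count in (count1, count2):
        if count:
            p = count / total
            entropy += p * math.log2(1 / p)
    return entropy


def entropy_over_words(transcriptions, seg1, seg2):
    """Count both segments across all words, ignoring environment."""
    counts = Counter(seg for segs in transcriptions for seg in segs)
    return two_segment_entropy(counts[seg1], counts[seg2])


# Toy data (assumption): "s" and "ʃ" each occur twice.
toy_words = [("s", "a"), ("ʃ", "i"), ("a", "s"), ("i", "ʃ")]
h_all = entropy_over_words(toy_words, "s", "ʃ")
```

Restricting the counts to specific environments and averaging the per-environment entropies, weighted by environment frequency, would be a natural extension of the same idea.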

## Symbol similarity

| Function | Description |
| --- | --- |
| `string_similarity.string_similarity(…)` | This function computes similarity of pairs of words across a corpus. |
| `edit_distance.edit_distance(word1, word2, …)` | Returns the Levenshtein edit distance between two words, word1 and word2; code drawn from http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Python. |
| `khorsi.khorsi(word1, word2, freq_base, …)` | Calculate the string similarity of two words given a set of … |
| `phono_edit_distance.phono_edit_distance(…)` | Returns an analogue to Levenshtein edit distance but uses … |
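Since several of the measures above build on Levenshtein edit distance, a compact reference version may be useful. This is the standard dynamic-programming algorithm with unit costs, written as a standalone sketch; the function name `levenshtein` is hypothetical and the code is not copied from the `edit_distance` module.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence `a` into sequence `b`."""
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j symbols of `b`.
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion from a
                    current[j - 1] + 1,      # insertion into a
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


distance = levenshtein("kitten", "sitting")
```

The function works on any pair of sequences, so it can be applied to strings of letters or to tuples of segment symbols alike.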