# API Reference

## Lexicon classes

| Class | Description |
| --- | --- |
| `lexicon.Attribute(name, att_type[, …])` | Attributes are for collecting summary information about attributes of Words or WordTokens, with different types of attributes allowing for different behaviour |
| `lexicon.Corpus(name[, update])` | Lexicon to store information about Words, such as transcriptions, spellings and frequencies |
| `lexicon.Inventory([update])` | Inventories contain information about a Corpus’ segmental inventory |
| `lexicon.FeatureMatrix(name, feature_entries)` | An object that stores feature values for segments |
| `lexicon.Segment(symbol[, features])` | Class for segment symbols |
| `lexicon.Transcription(seg_list)` | Transcription object, a sequence of symbols |
| `lexicon.Word(**kwargs)` | An object representing a word in a corpus |
| `lexicon.EnvironmentFilter(middle_segments[, …])` | Filter to use for searching words to generate Environments that match |
| `lexicon.Environment(middle, position[, lhs, rhs])` | Specific sequence of segments that was a match for an EnvironmentFilter |

## Speech corpus classes

| Class | Description |
| --- | --- |
| `spontaneous.Discourse(kwargs)` | Discourse objects are collections of linear text with word tokens |
| `spontaneous.Speaker(name, **kwargs)` | Speaker objects contain information about the producers of WordTokens or Discourses |
| `spontaneous.SpontaneousSpeechCorpus(name, …)` | SpontaneousSpeechCorpus objects are a collection of Discourse objects and Corpus objects for frequency information |
| `spontaneous.WordToken([update])` | WordToken objects are individual productions of Words |

### Corpus context managers

| Class | Description |
| --- | --- |
| `contextmanagers.BaseCorpusContext(corpus, …)` | Abstract corpus context class that all other contexts inherit from |
| `contextmanagers.CanonicalVariantContext(…)` | Corpus context that uses canonical forms for transcriptions and tiers |
| `contextmanagers.MostFrequentVariantContext(…)` | Corpus context that uses the most frequent pronunciation variants for transcriptions and tiers |
| `contextmanagers.SeparatedTokensVariantContext(…)` | Corpus context that treats pronunciation variants as separate types for transcriptions and tiers |
| `contextmanagers.WeightedVariantContext(…)` | Corpus context that weights the frequency of pronunciation variants by the number of variants or the token frequency for transcriptions and tiers |

## Corpus binaries

| Function | Description |
| --- | --- |
| `binary.download_binary(name, path[, call_back])` | Download a binary file of example corpora and feature matrices |
| `binary.load_binary(path)` | Unpickle a binary file |
| `binary.save_binary(obj, path)` | Pickle a Corpus or FeatureMatrix object for later loading |
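Per their descriptions, `save_binary` and `load_binary` pickle and unpickle objects, so the round trip they provide can be illustrated with Python's standard `pickle` module alone. The corpus object below is a plain stand-in, not a PCT class:

```python
import pickle

# Stand-in for a Corpus or FeatureMatrix object; any picklable object works.
corpus = {"name": "example", "words": ["mata", "nata"]}

# Pickle the object to disk, then unpickle it again.
with open("example.corpus", "wb") as f:
    pickle.dump(corpus, f)

with open("example.corpus", "rb") as f:
    restored = pickle.load(f)

print(restored == corpus)  # -> True
```

As with any pickle-based format, binaries should only be loaded from trusted sources.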

## Load from CSV

| Function | Description |
| --- | --- |
| `csv.load_corpus_csv(corpus_name, path, delimiter)` | Load a corpus from a column-delimited text file |
| `csv.load_feature_matrix_csv(name, path, …)` | Load a FeatureMatrix from a column-delimited text file |

## Export to CSV

| Function | Description |
| --- | --- |
| `csv.export_corpus_csv(corpus, path[, …])` | Save a corpus as a column-delimited text file |
| `csv.export_feature_matrix_csv(…[, delimiter])` | Save a FeatureMatrix as a column-delimited text file |

## TextGrids

- `textgrid.inspect_discourse_textgrid`
- `textgrid.load_discourse_textgrid`
- `textgrid.load_directory_textgrid`

## Running text

| Function | Description |
| --- | --- |
| `text_spelling.inspect_discourse_spelling(path)` | Generate a list of AnnotationTypes for a specified text file, for parsing it as an orthographic text |
| `text_spelling.load_discourse_spelling(…[, …])` | Load a discourse from a text file containing running orthographic text |
| `text_spelling.load_directory_spelling(…[, …])` | Load a directory of orthographic texts |
| `text_spelling.export_discourse_spelling(…)` | Export an orthographic discourse to a text file |
| `text_transcription.inspect_discourse_transcription(path)` | Generate a list of AnnotationTypes for a specified text file, for parsing it as a transcribed text |
| `text_transcription.load_discourse_transcription(…)` | Load a discourse from a text file containing running transcribed text |
| `text_transcription.load_directory_transcription(…)` | Load a directory of transcribed texts |
| `text_transcription.export_discourse_transcription(…)` | Export a transcribed discourse to a text file |

## Interlinear gloss text

| Function | Description |
| --- | --- |
| `text_ilg.inspect_discourse_ilg(path[, number])` | Generate a list of AnnotationTypes for a specified text file, for parsing it as an interlinear gloss text file |
| `text_ilg.load_discourse_ilg(corpus_name, …)` | Load a discourse from a text file containing interlinear glosses |
| `text_ilg.load_directory_ilg(corpus_name, …)` | Load a directory of interlinear gloss text files |
| `text_ilg.export_discourse_ilg(discourse, path)` | Export a discourse to an interlinear gloss text file, with a maximal line size of 10 words |

## Other standards

| Function | Description |
| --- | --- |
| `multiple_files.inspect_discourse_multiple_files(…)` | Generate a list of AnnotationTypes for a specified dialect |
| `multiple_files.load_discourse_multiple_files(…)` | Load a discourse from a text file containing interlinear glosses |
| `multiple_files.load_directory_multiple_files(…)` | Load a directory of corpus standard files (separated into words files and phones files) |

## Frequency of alternation

Frequency of alternation is currently not supported in PCT.

| Function | Description |
| --- | --- |
| `freq_of_alt.calc_freq_of_alt(corpus_context, …)` | Returns a double that is a measure of the frequency of alternation of two sounds in a given corpus |

## Functional load

- `functional_load.minpair_fl`
- `functional_load.deltah_fl`
- `functional_load.relative_minpair_fl`
- `functional_load.relative_deltah_fl`

## Kullback-Leibler divergence

| Class | Description |
| --- | --- |
| `kl.KullbackLeibler(corpus_context, seg1, …)` | Calculates KL distances between two Phoneme objects in some context, either the left- or right-hand side |
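The class computes KL distances over segment context distributions; the underlying measure itself, for two discrete distributions P and Q over the same outcomes, is a short computation. This sketch illustrates the formula only, not the PCT API:

```python
from math import log2

def kl_divergence(p, q):
    """D(P || Q) = sum over x of P(x) * log2(P(x) / Q(x)).

    Zero-probability outcomes in P contribute nothing and are skipped.
    """
    return sum(p[x] * log2(p[x] / q[x]) for x in p if p[x] > 0)

# Two toy distributions over the same three contexts.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
q = {"a": 0.25, "b": 0.5, "c": 0.25}

print(round(kl_divergence(p, q), 4))  # -> 0.25
```

Note that KL divergence is asymmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.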

## Mutual information

| Function | Description |
| --- | --- |
| `mutual_information.pointwise_mi(…[, …])` | Calculate the mutual information for a bigram |
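Pointwise mutual information for a bigram compares the bigram's joint probability with the product of the two unigram probabilities. A minimal illustration of the formula (not the `pointwise_mi` signature, which takes a corpus context):

```python
from math import log2

def pmi(p_xy, p_x, p_y):
    """pMI = log2( p(x, y) / (p(x) * p(y)) )."""
    return log2(p_xy / (p_x * p_y))

# If the bigram occurs exactly as often as chance predicts, pMI is 0;
# more often than chance gives a positive value, less often a negative one.
print(pmi(0.25, 0.5, 0.5))  # -> 0.0
print(pmi(0.5, 0.5, 0.5))   # -> 1.0
```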

## Neighborhood density

| Function | Description |
| --- | --- |
| `neighborhood_density.neighborhood_density(…)` | Calculate the neighborhood density of a particular word in the corpus |
| `neighborhood_density.find_mutation_minpairs(…)` | Find all minimal pairs of the query word based only on segment mutations (not deletions/insertions) |
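Neighborhood density conventionally counts the words in the lexicon at edit distance 1 from a query; the mutation-only variant restricts that to substitutions. A toy sketch of the substitution-only case over a plain word list (`substitution_neighbors` is a hypothetical helper for illustration, not the PCT function):

```python
def substitution_neighbors(query, lexicon):
    """Return words differing from query by exactly one substituted symbol."""
    neighbors = []
    for word in lexicon:
        if word == query or len(word) != len(query):
            continue
        # Count mismatched positions; exactly one means a substitution neighbor.
        if sum(a != b for a, b in zip(query, word)) == 1:
            neighbors.append(word)
    return neighbors

lexicon = ["mata", "nata", "mala", "mat", "data"]
print(substitution_neighbors("mata", lexicon))  # -> ['nata', 'mala', 'data']
```

A real corpus would compare transcriptions segment by segment rather than orthographic characters, but the counting logic is the same.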

## Phonotactic probability

| Function | Description |
| --- | --- |
| `phonotactic_probability.phonotactic_probability_vitevitch(…)` | Calculate the phonotactic probability of a particular word using the Vitevitch & Luce algorithm |
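The Vitevitch & Luce measure is built from position-specific segment probabilities (and biphone probabilities) estimated from the lexicon. A minimal sketch of the positional-segment half over a toy word list, as an illustration of the idea only, not PCT's implementation:

```python
from collections import Counter

def positional_segment_probability(word, lexicon):
    """Average, over positions i, of P(word[i] occurs at position i) in the lexicon."""
    probs = []
    for i, seg in enumerate(word):
        # Distribution of segments attested at position i across the lexicon.
        counts = Counter(w[i] for w in lexicon if len(w) > i)
        total = sum(counts.values())
        probs.append(counts[seg] / total)
    return sum(probs) / len(probs)

lexicon = ["ma", "na", "mi"]
# P(m at position 0) = 2/3, P(a at position 1) = 2/3; average = 2/3.
print(round(positional_segment_probability("ma", lexicon), 4))  # -> 0.6667
```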

## Predictability of distribution

| Function | Description |
| --- | --- |
| `pred_of_dist.calc_prod_all_envs(…[, …])` | Main function for calculating predictability of distribution for two segments over a corpus, regardless of environment |
| `pred_of_dist.calc_prod(corpus_context, envs)` | Main function for calculating predictability of distribution for two segments over specified environments in a corpus |
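Predictability of distribution is standardly quantified as the entropy of the two segments' relative frequencies in an environment: 0 bits means fully predictable (complementary distribution), 1 bit means maximally unpredictable (equally frequent). A sketch of that entropy computation, illustrating the formula rather than the `calc_prod` signature:

```python
from math import log2

def distribution_entropy(count1, count2):
    """Entropy (in bits) of a two-way choice with the given frequency counts."""
    total = count1 + count2
    probs = [c / total for c in (count1, count2) if c > 0]
    return sum(-p * log2(p) for p in probs)

print(distribution_entropy(10, 0))   # complementary distribution -> 0.0
print(distribution_entropy(10, 10))  # equally frequent -> 1.0
```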

## Symbol similarity

| Function | Description |
| --- | --- |
| `string_similarity.string_similarity(…)` | Computes the similarity of pairs of words across a corpus |
| `edit_distance.edit_distance(word1, word2, …)` | Returns the Levenshtein edit distance between two words `word1` and `word2`; code drawn from http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Python |
| `khorsi.khorsi(word1, word2, freq_base, …)` | Calculate the string similarity of two words given a set of characters and their frequencies in a corpus, based on Khorsi (2012) |
| `phono_edit_distance.phono_edit_distance(…)` | Returns an analogue to Levenshtein edit distance that uses phonological features instead of characters |
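The plain Levenshtein distance underlying `edit_distance` and `phono_edit_distance` is a few lines of standard dynamic programming. This is a generic implementation, not PCT's (the phonological variant additionally weights substitutions by feature differences):

```python
def levenshtein(s1, s2):
    """Minimum number of insertions, deletions and substitutions turning s1 into s2."""
    # prev[j] holds the distance between s1[:i-1] and s2[:j] for the previous row.
    prev = list(range(len(s2) + 1))
    for i, a in enumerate(s1, start=1):
        curr = [i]
        for j, b in enumerate(s2, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (a != b)))  # substitution (free if equal)
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # -> 3
```

The same recurrence works over sequences of segments instead of characters, which is the basis of the feature-weighted variant.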