CNA_Database.py

CNA_Database.py, Geoffrey Weal, 29/10/2018

This script holds the information required to make a CNA_database.

class Organisms.GA.SCM_Scripts.CNA_Database.CNA_Database(rCuts, population, cut_off_similarity, get_single_CNA_profile_method, get_similarity_profile_method, no_of_cpus, debug)

This is a database that holds all the entries of CNA_Entry objects.

Parameters:

rCuts (list of float) – The rCut values to scan the CNA across.
population (Organisms.GA.Population) –
cut_off_similarity (float) – The similarity above which two clusters are considered similar enough for the offspring to be excluded from future generations.
get_single_CNA_profile_method (__func__) – This is the method to get the CNA profile, either from the T-SCM or the A-SCM.
get_similarity_profile_method (__func__) – This is the method that uses the CNA profiles of two clusters to give the similarity profile between them, obtained by either the T-SCM or the A-SCM.
no_of_cpus (int) – The number of CPUs to use to obtain the similarity profile between two clusters.
debug (bool) – Get data to use to debug the SCM. Default: False

add(collection, initialise=False)

Add the similarity data of the clusters in the collection to the CNA_Database

Parameters:

collection (Organisms.GA.Collection) – The clusters to add to the CNA_Database
initialise (bool) – Whether the clusters are from a newly created population. Default: False

check_database(population)

Check that the database contains the correct number of entries.

Parameters: population (Organisms.GA.Population) – The population

get_all_averages_for_a_cluster(cluster_name)

Get the average SCM similarities of a cluster compared to every other cluster in the similarity profile.

Parameters: cluster_name (list of floats) – The cluster to get the average similarities of.
Returns: All the average similarities between cluster_name and every other cluster in the database.
Return type: list of float
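
The averaging described above can be sketched as follows. This is a minimal illustration and not the package's implementation: the `similarity_profiles` dictionary, its `frozenset` keys, and the helper name `all_averages_for_a_cluster` are all assumptions introduced here, with each profile holding one similarity percentage per rCut value.

```python
from statistics import mean

# Hypothetical stand-in for the database: each unordered pair of cluster
# names maps to a similarity profile (one percentage per rCut value).
similarity_profiles = {
    frozenset({1, 2}): [80.0, 90.0, 100.0],
    frozenset({1, 3}): [10.0, 20.0, 30.0],
    frozenset({2, 3}): [50.0, 50.0, 50.0],
}

def all_averages_for_a_cluster(profiles, cluster_name):
    """Average the similarity profile of every pair that involves cluster_name."""
    return [mean(profile) for pair, profile in profiles.items() if cluster_name in pair]

averages = all_averages_for_a_cluster(similarity_profiles, 1)
print(averages)
```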

get_details()

Return the information about the CNA Database.

Returns: The cut_off_similarity, get_single_CNA_method, get_similarity_profile_method, no_of_cpus, debug
Return type: (float, __func__, __func__, int, bool)

get_max_similarity(name_1, name_2)

This method obtains the maximum similarity percentage from the comparison of two clusters, name_1 and name_2.

Parameters:

name_1 (int) – The name of the first cluster you would like to look for an entry in the CNA_Database.
name_2 (int) – The name of the second cluster you would like to look for an entry in the CNA_Database.

Returns:

The maximum similarity between clusters name_1 and name_2.

Return type:

float

get_similar_clusters_in_database()

Get the clusters in the CNA_Database that are deemed structurally similar.

Returns: A list of the names of all the clusters that are similar to each other.
Return type: list of int
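
One way to picture this check is to flag every cluster whose maximum similarity to some other cluster lies above cut_off_similarity. The sketch below assumes a hypothetical dictionary of similarity profiles keyed by unordered pairs of cluster names, and assumes a strict "greater than" comparison; the real database may store and compare its entries differently.

```python
# Hypothetical data: unordered pairs of cluster names mapped to similarity
# profiles (one similarity percentage per rCut value).
similarity_profiles = {
    frozenset({1, 2}): [60.0, 95.0, 70.0],
    frozenset({1, 3}): [10.0, 20.0, 30.0],
    frozenset({2, 3}): [40.0, 45.0, 50.0],
}
cut_off_similarity = 90.0  # assumed threshold, in percent

def similar_clusters(profiles, cut_off):
    """Return the sorted names of clusters whose maximum similarity to at
    least one other cluster is above cut_off."""
    similar = set()
    for pair, profile in profiles.items():
        if max(profile) > cut_off:
            similar.update(pair)
    return sorted(similar)

print(similar_clusters(similarity_profiles, cut_off_similarity))
```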

is_pair_in_the_database(dir_1, dir_2)

Determine if a CNA entry exists for two clusters in the similarity_profile_database

Parameters:

dir_1 (int) – The dir of the first cluster you would like to look for an entry in the CNA_Database.
dir_2 (int) – The dir of the second cluster you would like to look for an entry in the CNA_Database.

Returns:

True if a similarity profile entry exists for clusters dir_1 and dir_2, False if not.

Return type:

bool
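
As an illustration of a pair lookup, the sketch below stores entries under a key that does not depend on the order of the two names. The helper names `pair_key` and `is_pair_in_entries` and the plain dictionary are hypothetical, and treating the lookup as order-independent is an assumption; the actual class keeps its entries in a Tree.

```python
def pair_key(name_1, name_2):
    """Build a key that is the same whichever order the names are given in."""
    return (min(name_1, name_2), max(name_1, name_2))

# Hypothetical database contents: only the pair (1, 2) has an entry.
entries = {pair_key(2, 1): [80.0, 90.0]}

def is_pair_in_entries(entries, name_1, name_2):
    """Return True if an entry exists for the two clusters, False if not."""
    return pair_key(name_1, name_2) in entries

print(is_pair_in_entries(entries, 1, 2), is_pair_in_entries(entries, 1, 3))
```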

keys()

Return a list of the names of all the clusters in the database

Returns: A list of the names of all the clusters in the database
Return type: list of int

make_simple_table(similarity_profile_database)

This is a simple table that can be printed to the terminal. It shows all the similarities between clusters in the population and offspring.

Parameters: similarity_profile_database (Organisms.GA.SCM_Scripts.CNA_Database.CNA_Database) – This is the CNA database that contains all similarity information of clusters in the population and offspring.

print_cna_database_details(): Print information about the database.

remove(names_to_remove)

Remove all entries that exist in the database that are associated with a particular cluster.

Parameters: names_to_remove (int) – The name of the cluster you would like to remove from the CNA_Database.
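
Removing a cluster means dropping every pairwise entry that involves it, not only one record. A minimal sketch, assuming a hypothetical dictionary keyed by unordered pairs of cluster names (the helper `remove_cluster` is not part of the package):

```python
# Hypothetical database contents keyed by unordered pairs of cluster names.
entries = {
    frozenset({1, 2}): [80.0, 90.0],
    frozenset({1, 3}): [10.0, 20.0],
    frozenset({2, 3}): [50.0, 55.0],
}

def remove_cluster(entries, name_to_remove):
    """Return a copy of entries without any pair that involves name_to_remove."""
    return {pair: profile for pair, profile in entries.items()
            if name_to_remove not in pair}

remaining = remove_cluster(entries, 2)
print(list(remaining))
```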

reset(): Reset the CNA profiles of generated clusters and the similarity profile database.

class Organisms.GA.SCM_Scripts.CNA_Database.Tree

This is a Tree designed for the CNA_Database to hold references of CNA_Entry.

see_tree(): This shown the clusters in the database

Organisms.GA.SCM_Scripts.CNA_Database.cna_profile_generator(collection, rCuts)

This is a generator that returns the clusters in the collection with the rCut values to scan across.

Parameters:

collection (Organisms.GA.Collection) – A collection
rCuts (list of float) – The list of rCut values to scan across with the SCM.

Returns:

A tuple of the cluster and the rCut values to scan across with the SCM.
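
The shape of such a generator can be sketched as below. The function name `cna_profile_inputs` and the string placeholders standing in for cluster objects are assumptions made for illustration only.

```python
def cna_profile_inputs(collection, rCuts):
    """Yield one (cluster, rCuts) tuple per cluster in the collection."""
    for cluster in collection:
        yield (cluster, rCuts)

collection = ["cluster_1", "cluster_2"]  # placeholders for cluster objects
rCuts = [2.8, 2.9, 3.0]                  # illustrative rCut values

tasks = list(cna_profile_inputs(collection, rCuts))
print(tasks)
```

Bundling each cluster with the rCut values in a single tuple is convenient when the work is handed to a worker function that takes one argument, as the map-style methods of `multiprocessing.Pool` do.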

Organisms.GA.SCM_Scripts.CNA_Database.initial_similarity_profile_generator(collection, cna_database)

This is a generator that returns pairs of clusters in the collection together with their CNA profiles.

Parameters:

collection (Organisms.GA.Collection) – A collection
cna_database (Organisms.GA.SCM_Scripts.CNA_Database.CNA_Database) – The CNA database that holds the CNA profiles of the clusters.

Returns:

a tuple of the names of the clusters and their associated CNA profiles.

Return type:

(int, int, Counter, Counter)
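
A generator with this return type could look like the sketch below, which assumes that every unordered pair of clusters in the collection is compared once. The `cna_profiles` dictionary, the signature tuples inside the Counters, and the name `initial_pairs` are illustrative and not taken from the package.

```python
from collections import Counter
from itertools import combinations

# Hypothetical CNA profiles keyed by cluster name. Real profiles would hold
# the CNA signature counts produced by the SCM.
cna_profiles = {
    1: Counter({(4, 2, 1): 12}),
    2: Counter({(4, 2, 1): 10, (5, 5, 5): 2}),
    3: Counter({(5, 5, 5): 12}),
}

def initial_pairs(names, profiles):
    """Yield (name_1, name_2, profile_1, profile_2) for every unordered pair."""
    for name_1, name_2 in combinations(names, 2):
        yield (name_1, name_2, profiles[name_1], profiles[name_2])

pairs = list(initial_pairs([1, 2, 3], cna_profiles))
print([(name_1, name_2) for name_1, name_2, _, _ in pairs])
```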

Organisms.GA.SCM_Scripts.CNA_Database.similarity_profile_generator(population, offsprings, cna_database)

This is a generator that returns pairs of clusters from the population and offspring together with their CNA profiles.

Parameters:

population (Organisms.GA.Population) – The population
offsprings (Organisms.GA.Offspring_Pool) – The collection of offspring
cna_database (Organisms.GA.SCM_Scripts.CNA_Database.CNA_Database) – The CNA database that holds the CNA profiles of the clusters.

Returns:

a tuple of the names of the clusters and their associated CNA profiles.

Return type:

(int, int, Counter, Counter)
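
When offspring join an existing population, only pairs that involve at least one offspring are new. The sketch below shows one plausible pairing under that assumption; which pairs the real generator yields is not stated here, and the name `new_pairs` and the placeholder profiles are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

def new_pairs(population_names, offspring_names, profiles):
    """Yield (name_1, name_2, profile_1, profile_2) for each pair that
    involves at least one offspring."""
    # Every cluster in the population against every offspring.
    for name_1, name_2 in product(population_names, offspring_names):
        yield (name_1, name_2, profiles[name_1], profiles[name_2])
    # Every offspring against every other offspring.
    for name_1, name_2 in combinations(offspring_names, 2):
        yield (name_1, name_2, profiles[name_1], profiles[name_2])

# Placeholder profiles: the counts carry no physical meaning.
profiles = {name: Counter({(4, 2, 1): name}) for name in (1, 2, 3, 4)}
pairs = [(a, b) for a, b, _, _ in new_pairs([1, 2], [3, 4], profiles)]
print(pairs)
```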