CNA_Database.py
CNA_Database.py, Geoffrey Weal, 29/10/2018
This script holds the information required to make a CNA_database.
- class Organisms.GA.SCM_Scripts.CNA_Database.CNA_Database(rCuts, population, cut_off_similarity, get_single_CNA_profile_method, get_similarity_profile_method, no_of_cpus, debug)
This is a database that holds all the entries of CNA_Entry objects.
- Parameters:
rCuts (list of float) – The rCut values to scan the CNA across.
population (Organisms.GA.Population) – The population.
cut_off_similarity (float) – The similarity threshold. An offspring whose similarity to another cluster is above this value is considered similar enough to be excluded from future generations.
get_single_CNA_profile_method (__func__) – This is the method to get the CNA Profile, either from the T-SCM or the A-SCM.
get_similarity_profile_method (__func__) – This is the method that uses the CNA profiles of two clusters to give the similarity profile between them, whether this is obtained by the T-SCM or the A-SCM.
no_of_cpus (int) – The number of CPUs to use to obtain the similarity profile between two clusters.
debug (bool) – Get data to use to debug the SCM. Default: False
- add(collection, initialise=False)
Add the similarity data of the clusters in the collection to the CNA_Database
- Parameters:
collection (Organisms.GA.Collection) – The clusters to add to the CNA_Database
initialise (bool) – Whether the clusters are from a newly created population. Default: False
- check_database(population)
Check the database to make sure there are the correct number of entries in the database.
- Parameters:
population (Organisms.GA.Population) – The population
- get_all_averages_for_a_cluster(cluster_name)
Get the average SCM similarities of a cluster compared to every other cluster in the similarity profile.
- Parameters:
cluster_name (list of floats) – Cluster to get the average similarities of.
- Returns:
All the average similarities between cluster_name and every other cluster in the database.
- Return type:
list of float
- get_details()
Return the information about the CNA Database.
- Returns:
The cut_off_similarity, get_single_CNA_profile_method, get_similarity_profile_method, no_of_cpus and debug settings.
- Return type:
(float, __func__, __func__, int, bool)
- get_max_similarity(name_1, name_2)
This method obtains the maximum similarity percentage from the comparison of two clusters, name_1 and name_2.
- Parameters:
name_1 (int) – The name of the first cluster you would like to look for an entry in the CNA_Database.
name_2 (int) – The name of the second cluster you would like to look for an entry in the CNA_Database.
- Returns:
The maximum similarity between clusters name_1 and name_2.
- Return type:
float
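As a rough illustration of what "maximum similarity" means, the sketch below assumes that the similarity profile of a pair of clusters maps each rCut value to a similarity percentage, and simply takes the largest value. The dictionary layout and the helper name max_similarity are assumptions made for this example; they are not the internal representation used by CNA_Database.

```python
def max_similarity(similarity_profile):
    """Return the largest similarity percentage in a similarity profile.

    similarity_profile is assumed (hypothetically) to map rCut -> similarity in %.
    """
    if not similarity_profile:
        raise ValueError("similarity profile is empty")
    return max(similarity_profile.values())


# Hypothetical profile for one pair of clusters, scanned over three rCut values.
profile = {2.9: 41.5, 3.0: 87.25, 3.1: 63.0}
print(max_similarity(profile))  # prints 87.25
```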
- get_similar_clusters_in_database()
Get the clusters in the CNA_Database that are deemed structurally similar.
- Returns:
A list of the names of all the clusters that are similar to each other.
- Return type:
list of int
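The notion of "deemed structurally similar" can be sketched as a threshold test against cut_off_similarity. The example below assumes, purely for illustration, that the maximum similarity of each pair is held in a dictionary keyed by a tuple of cluster names; the helper similar_clusters and that layout are hypothetical. It also assumes a strict "greater than" comparison, following the wording "above which" in the description of cut_off_similarity.

```python
def similar_clusters(max_similarities, cut_off_similarity):
    """Return the sorted names of clusters that belong to at least one similar pair.

    max_similarities is assumed (hypothetically) to map
    (name_1, name_2) -> maximum similarity in %.
    """
    similar = set()
    for (name_1, name_2), similarity in max_similarities.items():
        # Assumption: a pair counts as similar only if strictly above the cut-off.
        if similarity > cut_off_similarity:
            similar.update((name_1, name_2))
    return sorted(similar)


# Hypothetical maximum similarities between five clusters.
pairs = {(1, 2): 95.0, (1, 3): 40.0, (2, 3): 38.5, (4, 5): 99.0}
print(similar_clusters(pairs, cut_off_similarity=90.0))  # prints [1, 2, 4, 5]
```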
- is_pair_in_the_database(dir_1, dir_2)
Determine if a CNA entry exists for two clusters in the similarity_profile_database
- Parameters:
dir_1 (int) – The dir of the first cluster you would like to look for an entry in the CNA_Database.
dir_2 (int) – The dir of the second cluster you would like to look for an entry in the CNA_Database.
- Returns:
True if the similarity profile for clusters dir_1 and dir_2 exists in the database, False if not.
- Return type:
bool
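One way to picture a pair lookup is sketched below. It assumes that the order of the two clusters does not matter and stores each entry under a frozenset of the two names, so that (1, 2) and (2, 1) find the same entry. This layout and the helper is_pair_stored are assumptions for illustration only; CNA_Database itself keeps its references in the Tree class described further down.

```python
def is_pair_stored(database, name_1, name_2):
    """Return True if an entry exists for the unordered pair (name_1, name_2).

    database is assumed (hypothetically) to be a dict keyed by frozenset({a, b}).
    """
    return frozenset((name_1, name_2)) in database


# Hypothetical database holding a single entry for clusters 1 and 2.
database = {frozenset((1, 2)): "entry for clusters 1 and 2"}
print(is_pair_stored(database, 2, 1))  # prints True
print(is_pair_stored(database, 1, 3))  # prints False
```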
- keys()
Return a list of the names of all the clusters in the database
- Returns:
A list of the names of all the clusters in the database
- Return type:
list of int
- make_simple_table(similarity_profile_database)
This is a simple table that can be printed to the terminal and shows all the similarities between clusters in the population and offspring.
- Parameters:
similarity_profile_database (Organisms.GA.SCM_Scripts.CNA_Database.CNA_Database) – This is the CNA database that contains all similarity information of clusters in the population and offspring.
- print_cna_database_details()
Print information about the database.
- remove(names_to_remove)
Remove all entries in the database that are associated with a particular cluster.
- Parameters:
name_to_remove (int) – The name of the cluster you would like to remove from the CNA_Database.
- reset()
Reset the CNA profiles of generated clusters and the similarity profile database.
- class Organisms.GA.SCM_Scripts.CNA_Database.Tree
This is a Tree designed for the CNA_Database to hold references of CNA_Entry.
- see_tree()
This shows the clusters in the database.
- Organisms.GA.SCM_Scripts.CNA_Database.cna_profile_generator(collection, rCuts)
This is a generator that returns the clusters in the collection with the rCut values to scan across.
- Parameters:
collection (Organisms.GA.Collection) – A collection
rCuts (list of float) – The list of rCut values to scan across with the SCM.
- Returns:
A tuple of the cluster and the rCut values to scan across with the SCM.
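The shape of such a generator can be sketched as follows: each item pairs one cluster with the full list of rCut values, so that every item carries everything a worker function needs. The name cluster_and_rcuts and the use of plain strings as stand-ins for clusters are assumptions for this example, not the library's implementation.

```python
def cluster_and_rcuts(collection, rCuts):
    """Yield one (cluster, rCuts) tuple for every cluster in the collection."""
    for cluster in collection:
        yield (cluster, rCuts)


# Hypothetical collection; plain strings stand in for cluster objects.
collection = ["cluster_1", "cluster_2"]
rCuts = [2.9, 3.0, 3.1]
for cluster, cuts in cluster_and_rcuts(collection, rCuts):
    print(cluster, cuts)
```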
- Organisms.GA.SCM_Scripts.CNA_Database.initial_similarity_profile_generator(collection, cna_database)
This is a generator that returns pairs of clusters in the collection together with their associated CNA profiles.
- Parameters:
collection (Organisms.GA.Collection) – A collection
cna_database (list of float) – The list of clusters to scan across with the SCM.
- Returns:
a tuple of the names of the clusters and their associated CNA profiles.
- Return type:
(int, int, Counter, Counter)
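A minimal sketch of a generator with this return shape is given below. It assumes that every unordered pair of clusters in the collection is visited once and that CNA profiles can be looked up by cluster name in a dictionary of Counter objects. The helper pairwise_profiles, the names list, and the signature strings used as Counter keys are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_profiles(names, cna_profiles):
    """Yield (name_1, name_2, profile_1, profile_2) for every unordered pair of names."""
    for name_1, name_2 in combinations(names, 2):
        yield (name_1, name_2, cna_profiles[name_1], cna_profiles[name_2])


# Hypothetical CNA profiles: counts of CNA signatures for three clusters.
cna_profiles = {
    1: Counter({"(4, 2, 1)": 12, "(3, 1, 1)": 6}),
    2: Counter({"(4, 2, 1)": 10, "(3, 1, 1)": 8}),
    3: Counter({"(5, 5, 5)": 4}),
}
for name_1, name_2, profile_1, profile_2 in pairwise_profiles([1, 2, 3], cna_profiles):
    print(name_1, name_2)
```

With three clusters this visits the pairs (1, 2), (1, 3) and (2, 3).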
- Organisms.GA.SCM_Scripts.CNA_Database.similarity_profile_generator(population, offsprings, cna_database)
This is a generator that returns pairs of clusters from the population and the offspring together with their associated CNA profiles.
- Parameters:
population (Organisms.GA.Population) – The population
offsprings (Organisms.GA.Offspring_Pool) – The collection of offspring
cna_database (list of float) – The list of clusters to scan across with the SCM.
- Returns:
a tuple of the names of the clusters and their associated CNA profiles.
- Return type:
(int, int, Counter, Counter)
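Which pairs this generator visits is not stated above. The sketch below assumes that each offspring is compared with every cluster in the population and with every other offspring, on the assumption that pairs within the population are already in the database. The helper offspring_pairs, the name lists, and the dictionary of Counter profiles are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def offspring_pairs(population_names, offspring_names, cna_profiles):
    """Yield (name_1, name_2, profile_1, profile_2) for pairs involving an offspring."""
    # Assumption: every population cluster is compared with every offspring.
    for pop_name, off_name in product(population_names, offspring_names):
        yield (pop_name, off_name, cna_profiles[pop_name], cna_profiles[off_name])
    # Assumption: offspring are also compared with each other, once per pair.
    for off_1, off_2 in combinations(offspring_names, 2):
        yield (off_1, off_2, cna_profiles[off_1], cna_profiles[off_2])


# Hypothetical profiles: clusters 1 and 2 are the population, 3 and 4 are offspring.
profiles = {name: Counter({"(4, 2, 1)": name}) for name in (1, 2, 3, 4)}
for name_1, name_2, _, _ in offspring_pairs([1, 2], [3, 4], profiles):
    print(name_1, name_2)
```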