GA_Recording_System.py

This class will write the details of the genetic algorithm run.

Recording_Clusters.py, 02/10/2018, Geoffrey R Weal

class Organisms.GA.GA_Recording_System.GA_Recording_Database(path_of_database, max_no_of_recorded_structures=None, limit_datasize_of_database=None, ga_recording_scheme='None', limit_energy_height_of_clusters_recorded=inf, lower_energy_limit=-inf, upper_energy_limit=inf, show_GA_Recording_Database_check_percentage=False)

This is a Collection that has been designed to record clusters that have been created during the genetic algorithm.

This has been designed to give the user many ways to limit the number of clusters that are recorded. This is to prevent the database from getting too big on disk.

Parameters:
  • path_of_database (str.) – The path to this database

  • max_no_of_recorded_structures (int) – This is the maximum number of clusters that will be recorded. If this limit is reached, the higher energy clusters will be replaced by new, lower energy clusters.

  • limit_datasize_of_database (str.) – This is the maximum size the database can get. Give as a string with size + memory type (e.g. 150MB, 2.0GB)

  • ga_recording_scheme (str.) – The user can indicate a specific type of recording scheme to use to limit the clusters that are recorded. See manual for more details. Default: ‘None’

  • limit_energy_height_of_clusters_recorded (float) – If ga_recording_scheme == ‘Limit_energy_height’, this is the maximum energy above the LES energy. Any clusters that are lower in energy than Energy(LES) + limit_energy_height_of_clusters_recorded will be recorded. Any that have an energy higher than this will not be recorded. Default: float(‘inf’)

  • lower_energy_limit (float) – If ga_recording_scheme == ‘Set_energy_limits’, this is the lower energy limit. Any cluster with an energy lower than lower_energy_limit will not be recorded. Default: -float(‘inf’)

  • upper_energy_limit (float) – If ga_recording_scheme == ‘Set_energy_limits’ or ga_recording_scheme == ‘Set_higher_limit’, this is the upper energy limit. Any cluster with an energy higher than upper_energy_limit will not be recorded. Default: float(‘inf’)
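
The recording schemes described in the parameters above can be read as a simple energy filter. The following is a minimal sketch of that reading, not the code used by Organisms. The function name should_record and the argument les_energy are hypothetical, and treating an energy exactly equal to a limit as recordable is an assumption, since the descriptions above do not cover that case.

```python
from math import inf


def should_record(energy, ga_recording_scheme='None', les_energy=None,
                  limit_energy_height_of_clusters_recorded=inf,
                  lower_energy_limit=-inf, upper_energy_limit=inf):
    """Hypothetical helper: does a cluster energy pass the chosen scheme?"""
    if ga_recording_scheme == 'Limit_energy_height':
        # Record only clusters no higher than Energy(LES) + height.
        threshold = les_energy + limit_energy_height_of_clusters_recorded
        return energy <= threshold
    if ga_recording_scheme == 'Set_energy_limits':
        # Reject clusters below the lower limit or above the upper limit.
        return lower_energy_limit <= energy <= upper_energy_limit
    if ga_recording_scheme == 'Set_higher_limit':
        # Only the upper limit is applied.
        return energy <= upper_energy_limit
    # 'None': no energy-based restriction is applied.
    return True
```

For example, with les_energy=-10.0 and a height of 0.5, a cluster at -9.8 would pass this filter and a cluster at -9.0 would not.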

add(index, cluster)

Adds a cluster to the Collection.

Inputs:

index (int/str.): the index of the ith cluster in the Collection. If “End” is input, the cluster will be appended to the end of the Collection list.

cluster (Organisms.GA.Cluster): The cluster to add at the ith position in the Collection.
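
As an illustration of the two index forms, here is a minimal sketch that uses a plain Python list in place of the Collection. The function name add_to_collection is hypothetical, and using list.insert for an integer index is an assumption about what "add at the ith position" means.

```python
def add_to_collection(clusters, index, cluster):
    """Hypothetical stand-in for add(index, cluster) on a plain list."""
    if index == "End":
        # "End" appends the cluster to the end of the list.
        clusters.append(cluster)
    else:
        # An integer index places the cluster at that position.
        clusters.insert(index, cluster)
    return clusters
```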

add_clusters_into_RAM(cluster_dict, cluster_names)

This method adds clusters into RAM.

Parameters:
  • cluster_dict ({int: ASE.Cluster}) – This is a dictionary of all the clusters from the database, given as {cluster_name: Cluster}

  • cluster_names (list of int) – list of the names of the clusters that are needed for the collection

add_collection_to_database(collection, clusters_not_to_include)

Will record the clusters in the collection to the GA_Recording_Database.

Parameters:
  • collection (Organisms.GA.Collection) – The collection to be recorded

  • clusters_not_to_include (list of int) – A list of names of clusters in the collection not to record in GA_Recording_Database.
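
The role of clusters_not_to_include can be sketched as a filter over cluster names. This is only an illustration under the assumption that exclusion is decided by name. The function names_to_record is hypothetical and is not part of Organisms.

```python
def names_to_record(collection_names, clusters_not_to_include):
    """Hypothetical helper: names in the collection that should be recorded."""
    excluded = set(clusters_not_to_include)
    # Keep the original order of the collection, dropping excluded names.
    return [name for name in collection_names if name not in excluded]
```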

check_clusters_in_database(generation)

Will check the clusters in the database and remove any cluster that was created during the most recent unsuccessful generation.

Parameters:

generation (int) – The current generation

get_cluster_names(order=False)

Will provide a list of all the names of all the clusters in the Collection

Inputs:

order (bool.): This tag will tell this method whether the user would like the list of names given in order.

Returns:

List of the names of all the clusters in the Collection

get_index(name_to_find)

This method will provide the index of the cluster that has the name “name_to_find” in the Collection

Inputs:

name_to_find (int): the name of the cluster in the Collection to obtain the index for

Returns:

the index of the cluster in the Collection with the name “name_to_find”

Exceptions:

Will break if the cluster with the name “name_to_find” cannot be found in the Collection.
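
The lookup described above can be sketched over a plain list of cluster names. This is a minimal illustration, not the Organisms implementation. The function name find_index is hypothetical, and raising ValueError is an assumption, since the text above does not say which exception is raised.

```python
def find_index(cluster_names, name_to_find):
    """Hypothetical stand-in for get_index on a plain list of names."""
    for index, name in enumerate(cluster_names):
        if name == name_to_find:
            return index
    # Assumed behaviour: fail loudly when the name is not present.
    raise ValueError('No cluster named %s in the Collection' % name_to_find)
```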

import_information_from_database(current_generation)

Import any data from the database if it is needed. This should only be needed for the ‘Limit_energy_height’ scheme.

Parameters:

current_generation (int) – The current generation that your genetic algorithm trial is being resumed from

remove_to_database(cluster)

Allows the user to remove a cluster in the collection from the ASE database

Inputs:

cluster (Organisms.GA.Cluster): The cluster to remove from the database.

sort_by_energy()

This method will sort the clusters in the list by their energy (from lowest energy to highest energy).

sort_by_name()

This method will sort the clusters in the list by their name.
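
The two orderings can be illustrated with a small stand-in for a cluster. The Cluster namedtuple below is hypothetical and carries only a name and an energy. It is not the Organisms.GA.Cluster class, and the energies are made-up values for illustration only.

```python
from collections import namedtuple

# Hypothetical stand-in for a cluster: just a name and an energy.
Cluster = namedtuple('Cluster', ['name', 'energy'])

clusters = [Cluster(3, -7.9), Cluster(1, -5.2), Cluster(2, -6.4)]

# Lowest energy first, as described for sort_by_energy().
by_energy = sorted(clusters, key=lambda cluster: cluster.energy)

# Ordered by name, as described for sort_by_name().
by_name = sorted(clusters, key=lambda cluster: cluster.name)
```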

update_cluster_in_database_for_if_in_population(clusters_in_the_population, energies_of_clusters_removed_from_the_population)

This method will update clusters in the database if that cluster was ever accepted into the population.

Parameters:

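  • clusters_in_the_population (list of int) – This is a list of the names of clusters that are in the population.

  • energies_of_clusters_removed_from_the_population – This is a list of the energies of clusters that have been removed from the population.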

class Organisms.GA.GA_Recording_System.GA_Recording_System(ga_recording_information)

This class is designed to record the clusters that are created during the Organisms program run.

Parameters:

ga_recording_information – This is a dictionary that contains all the information that it needs to record clusters made during the Organisms program as the user desires.

add_collection(collection, offspring_to_remove)

This will add the clusters to the GA_Recording_System database

Parameters:
  • collection (Organisms.GA.Collection) – The collection to be recorded

  • offspring_to_remove (list of int) – This is a list of the names of offspring that have been removed by the diversity operator. Whether these clusters are left out of the database depends on whether self.exclude_recording_cluster_screened_by_diversity_scheme is True or False.

add_metadata()

This method is designed to assign the metadata to the ASE database, as in some versions of ASE this cannot happen until at least one cluster has been added to the ASE database.

check_clusters_in_database(generation)

Will check the clusters in the database and remove any cluster that was created during the most recent unsuccessful generation.

Parameters:

generation (int) – The current generation

import_information_from_database(current_generation)

Import any data from the database if it is needed. This should only be needed for the ‘Limit_energy_height’ scheme.

Parameters:

current_generation (int) – The current generation that your genetic algorithm trial is being resumed from

record_collection(collection, name_of_database, path_to_write_to)

This records an identical copy of the clusters in the population at a certain generation.

Parameters:
  • collection (Organisms.GA.Collection) – The collection to record into GA_Recording_System.

  • name_of_database (str.) – Name of the database to connect to

  • path_to_write_to (str.) – Path of the database or folder of xyz files to write to.

record_initial_populations(population)

Will record the clusters in the initial population into GA_Recording_System.

Parameters:

population (Organisms.GA.Population) – The initial population

record_population_at_generation(population, current_generation)

Will record the clusters in the population into GA_Recording_System at the current generation.

Parameters:
  • population (Organisms.GA.Population) – The current population after the generation has completed

  • current_generation (int) – The generation that your genetic algorithm is up to.

resume_ga_recording_system_from_current_generation(resume_from_generation)

Will check the ga_recording_system and restore it so that the genetic algorithm can be resumed from a given generation.

Parameters:

resume_from_generation – The generation that your genetic algorithm is up to.

update_cluster_in_database_for_if_in_population(clusters_in_the_population, energies_of_clusters_removed_from_the_population)

This method will update clusters in the database if that cluster was ever accepted into the population.

Parameters:
  • clusters_in_the_population (list of int) – This is a list of the names of clusters that are in the population.

  • energies_of_clusters_removed_from_the_population (list of int) – This is a list of the energies of clusters that have been removed from the population.

Organisms.GA.GA_Recording_System.convert_to_bytes(size)

Will convert the size in any disk space format to bytes.

Parameters:

size (str.) – The size of the database in any units

Returns:

data_size (float): The size of the database in bytes
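
A conversion of this kind can be sketched as follows. This is a hedged illustration rather than the Organisms implementation. The function name to_bytes and the UNITS table are hypothetical, and the use of binary multiples (1 KB = 1024 bytes) is an assumption that the text above does not confirm.

```python
import re

# Assumed unit table using binary multiples (1 KB = 1024 bytes).
UNITS = {'B': 1, 'KB': 1024, 'MB': 1024 ** 2, 'GB': 1024 ** 3, 'TB': 1024 ** 4}


def to_bytes(size):
    """Hypothetical sketch: convert a string such as '150MB' to bytes."""
    match = re.fullmatch(r'\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*', size)
    if match is None:
        raise ValueError('Cannot read a size and unit from %r' % size)
    number, unit = match.groups()
    unit = unit.upper()
    if unit not in UNITS:
        raise ValueError('Unknown unit %r' % unit)
    return float(number) * UNITS[unit]
```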

Organisms.GA.GA_Recording_System.get_size(size)

Will convert the disk space of the database into its most human-friendly form.

Input:

size (float): the disk space of the database in bytes
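
One way to produce a human-friendly size from a byte count is sketched below. The function name human_readable, the one-decimal output format, and the use of binary multiples (1 KB = 1024 bytes) are all assumptions made for illustration. The format that get_size actually returns is not stated above.

```python
def human_readable(size_in_bytes):
    """Hypothetical sketch: express a byte count in the largest fitting unit."""
    size = float(size_in_bytes)
    for unit in ('B', 'KB', 'MB', 'GB'):
        if size < 1024.0:
            return '%.1f %s' % (size, unit)
        # Move up to the next unit.
        size /= 1024.0
    return '%.1f TB' % size
```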

Organisms.GA.GA_Recording_System.make_folder(path_to_folder)

Will remake a folder, even if it already exists.

Input:

path_to_folder (str.): the path to the folder to remake.
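
Remaking a folder can be sketched with the standard library as shown below. This assumes that "remake" means deleting any existing folder, along with its contents, and creating an empty one in its place. The function name remake_folder is hypothetical and is not the Organisms function.

```python
import os
import shutil


def remake_folder(path_to_folder):
    """Hypothetical sketch: recreate a folder so that it exists and is empty."""
    if os.path.exists(path_to_folder):
        # Assumed behaviour: discard the old folder and everything in it.
        shutil.rmtree(path_to_folder)
    os.makedirs(path_to_folder)
```

Because the old contents are deleted, a helper like this should only be pointed at folders that the program itself owns.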