pantea.datasets package

Submodules

pantea.datasets.dataset module

class pantea.datasets.dataset.DataSourceInterface(*args, **kwargs)

Bases: Protocol

read_structures()
Return type

Iterator[Structure]
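Because DataSourceInterface is a Protocol, any object with a compatible read_structures() method should be usable as a data source without inheriting from it. The following is a minimal sketch of that idea. The Structure and DataSourceInterface classes here are simplified stand-ins written for illustration, not pantea's own definitions, and InMemoryDataSource and count_structures are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterator, List, Protocol


@dataclass
class Structure:
    """Hypothetical stand-in for pantea's Structure; it only stores a label."""

    label: str


class DataSourceInterface(Protocol):
    """Stand-in that mirrors the documented protocol method."""

    def read_structures(self) -> Iterator[Structure]: ...


class InMemoryDataSource:
    """Hypothetical data source; note that it does not inherit from the protocol."""

    def __init__(self, labels: List[str]) -> None:
        self.labels = labels

    def read_structures(self) -> Iterator[Structure]:
        # Yield structures one at a time, in order.
        for label in self.labels:
            yield Structure(label)


def count_structures(source: DataSourceInterface) -> int:
    """Accept any object that provides read_structures()."""
    return sum(1 for _ in source.read_structures())


source = InMemoryDataSource(["water", "methane"])
n_structures = count_structures(source)
```

A static type checker accepts InMemoryDataSource wherever DataSourceInterface is expected, because protocols are matched by method signature rather than by base class.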

class pantea.datasets.dataset.Dataset(datasource, persist, cache=<factory>)

Bases: object

A container for Structure data with caching support.

cache: Dict[int, Structure]
datasource: DataSourceInterface
classmethod from_runner(filename, persist=False, dtype=None)
Return type

Dataset

persist: bool
preload()

Preload (cache) all structures of the dataset into memory.

This ensures that any structure can be loaded quickly from memory in subsequent operations.

Return type

None
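The caching idea behind preload() can be sketched with a small stand-in class. This is not pantea's implementation: CachedDataset only reuses the documented field names (datasource, persist, cache), while CountingSource, the get() accessor, and the assumption that persist controls whether lazily read structures are kept in the cache are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class CountingSource:
    """Hypothetical data source that counts how many passes are made over it."""

    items: List[str]
    reads: int = 0

    def read_structures(self) -> Iterator[str]:
        self.reads += 1
        yield from self.items


@dataclass
class CachedDataset:
    """Stand-in with the documented field names; the behavior is assumed."""

    datasource: CountingSource
    persist: bool
    cache: Dict[int, str] = field(default_factory=dict)

    def preload(self) -> None:
        # A single sequential pass fills the cache, keyed by position.
        for index, structure in enumerate(self.datasource.read_structures()):
            self.cache[index] = structure

    def get(self, index: int) -> str:
        # Hypothetical accessor: serve from the cache when possible.
        if index in self.cache:
            return self.cache[index]
        for i, structure in enumerate(self.datasource.read_structures()):
            if i == index:
                if self.persist:  # assumed meaning of `persist`
                    self.cache[index] = structure
                return structure
        raise IndexError(index)


# After preload(), lookups no longer touch the data source.
dataset = CachedDataset(CountingSource(["s0", "s1", "s2"]), persist=False)
dataset.preload()
first = dataset.get(0)
last = dataset.get(2)
reads_after_preload = dataset.datasource.reads

# Without preload(), persist=True caches a structure on first access.
lazy = CachedDataset(CountingSource(["s0", "s1"]), persist=True)
lazy.get(1)
lazy.get(1)
lazy_reads = lazy.datasource.reads
```

In this sketch, preloading costs one full pass up front and makes every later lookup a dictionary access, which is the trade-off the description above refers to.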

pantea.datasets.runner module

class pantea.datasets.runner.RunnerDataSource(filename, dtype=None)

Bases: object

A data source for the RuNNer input data format, which consists of atomic attributes and simulation box information. Each snapshot holds two types of properties: per-atom properties and collective properties.

The per-atom properties include the element name, position, energy, charge, and force components, among others.

The collective properties include attributes such as the lattice parameters, total energy, and total charge.

Create a RuNNer data source from an input file.

Parameters
  • filename (Path) – input file name

  • dtype (Optional[Dtype], optional) – precision for the structure data, defaults to None

read_structures()

Read structures consecutively.

Reading structures consecutively from the file is faster than reading them by index. This makes the method suitable for efficiently preloading structures into memory, if needed.

Returns

An iterator over the structures in the file

Return type

Iterator[Structure]
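To make the snapshot layout and the consecutive read concrete, here is a minimal sketch of a generator that walks a RuNNer-style file front to back. It is not pantea's parser. It assumes the usual RuNNer keywords (begin, lattice, atom, energy, charge, end) and an atom line ordered as position, element, charge, energy, force. The names read_snapshots and SAMPLE, the dictionary layout, and the sample values are made up for illustration.

```python
import io
from typing import Any, Dict, Iterator, TextIO

# Two illustrative snapshots; the second one has no lattice lines.
SAMPLE = """\
begin
lattice 10.0 0.0 0.0
lattice 0.0 10.0 0.0
lattice 0.0 0.0 10.0
atom 0.0 0.0 0.0 H 0.1 0.0 0.0 0.0 0.5
atom 0.0 0.0 1.4 H -0.1 0.0 0.0 0.0 -0.5
energy -1.17
charge 0.0
end
begin
atom 1.0 2.0 3.0 He 0.0 0.0 0.0 0.0 0.0
energy -2.9
charge 0.0
end
"""


def read_snapshots(stream: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one dict per begin/end block while reading the stream once."""
    snapshot: Dict[str, Any] = {}
    for line in stream:
        tokens = line.split()
        if not tokens:
            continue
        keyword, values = tokens[0], tokens[1:]
        if keyword == "begin":
            snapshot = {"lattice": [], "atoms": []}
        elif keyword == "lattice":
            # Collective property: one lattice vector per line.
            snapshot["lattice"].append([float(v) for v in values])
        elif keyword == "atom":
            # Per-atom properties: position, element, charge, energy, force.
            x, y, z, element, charge, energy, fx, fy, fz = values
            snapshot["atoms"].append(
                {
                    "element": element,
                    "position": [float(x), float(y), float(z)],
                    "charge": float(charge),
                    "energy": float(energy),
                    "force": [float(fx), float(fy), float(fz)],
                }
            )
        elif keyword in ("energy", "charge"):
            # Collective properties: total energy and total charge.
            snapshot["total_" + keyword] = float(values[0])
        elif keyword == "end":
            yield snapshot


snapshots = list(read_snapshots(io.StringIO(SAMPLE)))
```

Because the generator never seeks backward, one pass over the file visits every snapshot exactly once. Reaching snapshot k by index with this approach would instead require scanning past the k snapshots before it, which is one plausible reason consecutive reading is the faster route for preloading.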

Module contents

class pantea.datasets.Dataset(datasource, persist, cache=<factory>)

Bases: object

A container for Structure data with caching support.

cache: Dict[int, Structure]
datasource: DataSourceInterface
classmethod from_runner(filename, persist=False, dtype=None)
Return type

Dataset

persist: bool
preload()

Preload (cache) all structures of the dataset into memory.

This ensures that any structure can be loaded quickly from memory in subsequent operations.

Return type

None

class pantea.datasets.RunnerDataSource(filename, dtype=None)

Bases: object

A data source for the RuNNer input data format, which consists of atomic attributes and simulation box information. Each snapshot holds two types of properties: per-atom properties and collective properties.

The per-atom properties include the element name, position, energy, charge, and force components, among others.

The collective properties include attributes such as the lattice parameters, total energy, and total charge.

Create a RuNNer data source from an input file.

Parameters
  • filename (Path) – input file name

  • dtype (Optional[Dtype], optional) – precision for the structure data, defaults to None

read_structures()

Read structures consecutively.

Reading structures consecutively from the file is faster than reading them by index. This makes the method suitable for efficiently preloading structures into memory, if needed.

Returns

An iterator over the structures in the file

Return type

Iterator[Structure]