Io read data
thmd.io.read_data
This module contains functions to read numeric data from text files in various formats.
Functions:
- matrix_lost – Function to read data in matrix form where the number of values differs from line to line (missing values).
- matrix – Function to read data laid out as a regular matrix.
- logMFD – Function to read data from a LogMFD calculation.
- lammps_var – Function to extract variable values from a LAMMPS input file.
- plumed_var – Function to extract variable values from a PLUMED input file.
- list_matrix_in_dir – Function to read data from all *.txt files in the current folder and its sub-folders.
matrix_lost(file_name: str, header_line: int = None, column_names: list[str] = None, comment: str = '#', sep: str = ' ', read_note: bool = False) -> pl.DataFrame
Function to read data in matrix form where the number of values differs from line to line (missing values).
Such files cannot be read directly by NumPy, polars, ...
The column names are extracted from header_line or set by column_names.
If neither column_names nor header_line is available, the default column names are: 0 1 2...
Parameters:
- file_name (str) – the text file.
- header_line (int, default: None) – the line from which to extract column names. Defaults to None.
- column_names (list, default: None) – names of the columns to extract. Defaults to None.
- comment (str, default: '#') – comment-line marker. Defaults to "#".
- sep (str, default: ' ') – separator. Defaults to " ".
- read_note (bool, default: False) – read the 'note' column (any text after the comment marker). Defaults to False.
Returns:
- df (DataFrame) – polars DataFrame
Notes
- To return two lists from a list comprehension, it is better (and may be faster) to run two separate list comprehensions.
- The .strip() method removes leading and trailing whitespace from a string.
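As a rough illustration of the idea (not the thmd implementation), a ragged text file can be parsed line by line and the short rows padded with None. The helper name `read_ragged` and the sample text are hypothetical, and the column-oriented dict it returns is only an assumed stand-in for data that could then be handed to a DataFrame constructor. The values and the notes are built with two separate list comprehensions, following the note above.

```python
def read_ragged(text, comment="#", sep=" "):
    """Parse ragged, separator-delimited text into padded columns.

    Hypothetical helper for illustration only; not part of thmd.
    """
    # Skip blank lines and lines that are entirely comments.
    data_lines = [
        ln for ln in text.splitlines()
        if ln.strip() and not ln.strip().startswith(comment)
    ]
    # First comprehension: numeric values before the comment mark.
    rows = [
        [float(v) for v in ln.split(comment, 1)[0].split(sep) if v]
        for ln in data_lines
    ]
    # Second comprehension: any text after the comment mark (the "note").
    notes = [
        ln.split(comment, 1)[1].strip() if comment in ln else ""
        for ln in data_lines
    ]
    # Pad short rows with None so every row has the same width.
    width = max((len(r) for r in rows), default=0)
    padded = [r + [None] * (width - len(r)) for r in rows]
    # Default column names 0 1 2... as described above.
    columns = {str(i): [r[i] for r in padded] for i in range(width)}
    return columns, notes


sample = """# strain stress extra
0.0 1.0 2.0
0.1 1.5          # yield point
0.2
"""
columns, notes = read_ragged(sample)
print(columns)
print(notes)
```

Padding with None keeps every column the same length, which is what a column-oriented table needs before missing values can be represented as nulls.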
matrix(file_name: str, header_line: int = None, column_names: list[str] = None, usecols: tuple[int] = None) -> pl.DataFrame
Function to read data laid out as a regular matrix.
The column names are extracted from column_names or header_line.
If neither column_names nor header_line is available, the default column names are: 0 1 2...
Parameters:
- file_name (str) – the text file.
- header_line (int, default: None) – the line from which to extract column names. Defaults to None.
- column_names (list[str], default: None) – names of the columns to extract. Defaults to None.
- usecols (tuple[int], default: None) – only extract some columns. Defaults to None.
Returns:
- df (DataFrame) – polars DataFrame
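The naming rule described for this function can be sketched in plain Python. This is a sketch, not the thmd code: the helper `read_regular` is hypothetical, and it assumes that column_names takes priority over header_line, that header_line is a 0-based line index, that the header may start with '#', and that usecols holds indices into the columns of the file.

```python
def read_regular(text, header_line=None, column_names=None, usecols=None):
    """Parse a whitespace-separated regular matrix into named columns.

    Hypothetical helper for illustration only; not part of thmd.
    """
    lines = text.splitlines()
    # Data rows: skip blanks, the header line, and comment lines.
    rows = [
        [float(v) for v in ln.split()]
        for i, ln in enumerate(lines)
        if ln.strip() and i != header_line and not ln.lstrip().startswith("#")
    ]
    n_cols = len(rows[0]) if rows else 0
    if column_names is not None:      # assumed to take priority
        names = list(column_names)
    elif header_line is not None:     # assumed 0-based line index
        names = lines[header_line].lstrip("# ").split()
    else:                             # documented fallback: 0 1 2...
        names = [str(i) for i in range(n_cols)]
    # Assumption: usecols selects columns by their index in the file.
    keep = range(n_cols) if usecols is None else usecols
    return {names[i]: [r[i] for r in rows] for i in keep}


sample = "# time temp press\n0 300 1.0\n1 310 1.1\n"
by_header = read_regular(sample, header_line=0)
by_default = read_regular(sample)
by_names = read_regular(sample, column_names=["t", "T", "P"], usecols=(0, 2))
print(by_header)
print(by_default)
print(by_names)
```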
logMFD(file_name, dim=1) -> pl.DataFrame
Function to read data from a LogMFD calculation.
Parameters:
- file_name (str) – the logmfd.out file.
- dim (int, default: 1) – dimension of the LogMFD calculation. Defaults to 1.
Raises:
- Exception – description
Returns:
- df (DataFrame) – polars DataFrame
lammps_var(file_name, var_names=None)
Function to extract variable values from a LAMMPS input file.
Parameters:
- file_name (str) – the text file in LAMMPS input format.
- var_names (list, default: None) – list of variables to extract. Defaults to None, which means all variables are extracted.
Returns:
- df (DataFrame) – polars DataFrame containing the variables in the LAMMPS file
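For orientation, LAMMPS defines variables with lines of the form `variable <name> <style> <args>`. The following is a minimal sketch of pulling those definitions out of an input script. The helper `lammps_variables` and the sample script are hypothetical, and the sketch returns a plain dict of strings rather than a DataFrame, because a variable may hold a formula instead of a number.

```python
def lammps_variables(text, var_names=None):
    """Collect `variable <name> <style> <args>` definitions from a script.

    Hypothetical helper for illustration only; not part of thmd.
    """
    found = {}
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        tokens = line.split("#", 1)[0].split()
        if len(tokens) >= 4 and tokens[0] == "variable":
            name, value = tokens[1], " ".join(tokens[3:])
            # var_names=None means "keep every variable".
            if var_names is None or name in var_names:
                found[name] = value
    return found


script = """variable temp equal 300.0   # target temperature
variable dt equal 0.001
units metal
variable tdamp equal 100*v_dt
"""
all_vars = lammps_variables(script)
only_dt = lammps_variables(script, var_names=["dt"])
print(all_vars)
print(only_dt)
```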
plumed_var(file_name, var_name, block_name=None)
Function to extract variable values from a PLUMED input file.
Parameters:
- file_name (str) – the text file in PLUMED input format.
- var_name (str) – PLUMED keyword to look up, e.g. INTERVAL, ...
- block_name (str, default: None) – block command in PLUMED, e.g. METAD, LOGMFD. Defaults to None.
Returns:
- value (float) – value of the PLUMED variable.
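PLUMED actions take their options as `KEYWORD=value` tokens, so a lookup of this kind can be sketched as a token scan. The helper `plumed_keyword` below is hypothetical and deliberately limited: it assumes single-line actions and scalar values, and does not handle multi-line `...` blocks or comma-separated values such as a two-number INTERVAL.

```python
def plumed_keyword(text, var_name, block_name=None):
    """Return the first scalar KEYWORD=value found, optionally per action.

    Hypothetical helper for illustration only; not part of thmd.
    """
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        tokens = line.split("#", 1)[0].split()
        # When a block is requested, only look at lines naming that action.
        if block_name is not None and block_name not in tokens:
            continue
        for tok in tokens:
            key, eq, value = tok.partition("=")
            if eq and key == var_name:
                return float(value)
    return None  # keyword not found


plumed_input = """d1: DISTANCE ATOMS=1,2
metad: METAD ARG=d1 SIGMA=0.2 HEIGHT=1.2 PACE=500
"""
height = plumed_keyword(plumed_input, "HEIGHT", block_name="METAD")
pace = plumed_keyword(plumed_input, "PACE")
missing = plumed_keyword(plumed_input, "HEIGHT", block_name="LOGMFD")
print(height, pace, missing)
```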
list_matrix_in_dir(search_key='deform_', file_ext='.txt', read_note=False, recursive=True)
Read data from all *.txt files in the current folder and its sub-folders.
Parameters:
- search_key (str, default: 'deform_') – a string to search for in file names.
- file_ext (str, default: '.txt') – file extension. Defaults to '.txt'.
- read_note (bool, default: False) – read the 'note' column into the pl.DataFrame. Defaults to False.
- recursive (bool, default: True) – search in sub-folders. Defaults to True.
Returns:
- ldf (list) – list of DataFrames.
- files (list) – list of filenames.
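The file-discovery step can be sketched with `pathlib`. The helper `find_files` is hypothetical, and it assumes that search_key may appear anywhere in the file name. The real function also reads each file into a DataFrame, which this sketch leaves out.

```python
import tempfile
from pathlib import Path


def find_files(root=".", search_key="deform_", file_ext=".txt", recursive=True):
    """List files whose name contains search_key and ends with file_ext.

    Hypothetical helper for illustration only; not part of thmd.
    """
    pattern = f"*{search_key}*{file_ext}"
    root = Path(root)
    # rglob descends into sub-folders; glob stays in the top folder.
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())


# Build a small throwaway tree to exercise both search modes.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "deform_x.txt").write_text("1 2 3\n")
    (base / "sub" / "deform_y.txt").write_text("4 5\n")
    (base / "other.txt").write_text("6\n")
    all_names = [p.name for p in find_files(base)]
    top_names = [p.name for p in find_files(base, recursive=False)]

print(all_names)
print(top_names)
```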